Vol. IV 


1932 


THE ANNALS 

of 

MATHEMATICAL 

STATISTICS 


Printed in U. S. A. 


Copyright 1932 



Published and Lithoprinted by

EDWARDS BROTHERS, INC.
ANN ARBOR, MICH.



EDITORIAL COMMITTEE 


H. C. Carver, Editor

R S S>t’l<hf^n, *'^svkn /(/;?/ Eld It (n 


Associate Editors

Burton H. Camp
Robert Henderson
Harold Hotelling

Henry Lewis Rietz
Henry Schultz


J. W. Edwards.




CONTENTS OF VOLUME IV 


On the System of Curves for which the Method of Moments
is the Best Method of Fitting 1
A. L. O'Toole

On the Logarithmic Frequency Distribution and the
Semi-logarithmic Correlation Surface 30
Pae-Tsi Yuan

A Simple Method for Calculating Mean Square Contingency 75
Elmer B. Royer

A Method of Determining the Constants in the Bimodal
Fourth Degree Exponential Function 79
A. L. O'Toole

On the Tchebychef Inequality of Bernstein 94
Cecil C. Craig

On Correlation Surfaces of Sums with a Certain Number of
Random Elements in Common 103
Carl H. Fischer

On the Correlation Between Certain Averages From Small
Samples 127
Allen T. Craig

On the Degree of Approximation of Certain Quadrature
Formulas 143
A. L. O'Toole

Polynomial Approximation by the Method of Least Squares 155
H. T. Davis

The Precision of the Weighted Average 196
H. Milicer Gruzewska

On Certain Relationships Between $\beta_1$ and $\beta_2$ For the Point
Binomial 216
Margaret Merrell



CONTENTS OF VOLUME IV -Continued 


Note on the Computation and Modification of Moments
(Editorial) 229
H. C. Carver

The Extended Probability Theory for the Continuous Variable
with Particular Application to the Linear Distribution 241
H. P. Lawther, Jr.

On the Elimination of Systematic Errors due to Grouping 263
John R. Abernethy

On Multiple and Partial Correlation Coefficients of a Certain
Sequence of Sums 278
Carl H. Fischer

An Experiment Regarding the $\chi^2$ Test 285
Selby Robinson

On Sampling from Compound Populations 288
George M. Brown



THE AMERICAN STATISTICAL ASSOCIATION 
Vol. IV No. 


THE ANNALS 

of 

MATHEMATICAL 

STATISTICS 


SIX DOLLARS PER ANNUM


FEBRUARY 


193a 



PUBLISHED QUARTERLY BY
AMERICAN STATISTICAL ASSOCIATION

Publication Office: Edwards Brothers, Inc., Ann Arbor, Michigan
Business Office: 530 Commerce Bldg., New York Univ., New York, N. Y.

Entered as second class matter at the Postoffice at Ann Arbor, Mich.,
under the Act of March 3rd, 1879.


STATEMENT OF OWNERSHIP

OF CONGRESS OF AUGUST 24, 1912

American Statistical Association, New York City, New York.

SQ. tkiivtffsi'ty qI Mfcbi^n. 

.Sr Sel4ion, Ann Arbori Michigan* 

J. W. Edwards, Ann Arbor, Michigan.

American Statistical Association, 530 Commerce Bldg., New York

City.



ON THE SYSTEM OF CURVES FOR WHICH THE 
METHOD OF MOMENTS IS THE BEST 
METHOD OF FITTING 


by 

A. L. O'Toole

National Research Fellow.

In R. A. Fisher's paper¹ on the mathematical foundations
of theoretical statistics the following statement is found: “The
method of moments applied in fitting Pearsonian curves has an
efficiency exceeding 80 per cent. only in the restricted region for
which $\beta_2$ lies between the limits 2.65 and 3.42 and for which
$\beta_1$ does not exceed 0.1. It was, of course, to be expected that
the first two moments would have 100 per cent. efficiency for the
normal curve, for they happen to be the optimum statistics for
fitting the normal curve. That the moment coefficients $\beta_1$ and $\beta_2$
also tend to 100 per cent. efficiency in this region suggests that
in the immediate neighborhood of the normal curve the departures
from normality specified by the Pearsonian formulas agree with
those of that system of curves for which the method of moments
gives the solution of the method of maximum likelihood.

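The system of curves for which the method of moments is the
best method of fitting may easily be deduced; for if the frequency
in the range $dx$ be $y(x, \theta_1, \theta_2, \theta_3, \theta_4)\,dx$ then
$\frac{\partial}{\partial\theta}\log y$ must involve $x$ only as polynomials up to
the fourth degree; consequently

$$y = e^{-a^4(x^4 + p_1x^3 + p_2x^2 + p_3x + p_4)},$$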


¹ Philosophical Transactions of the Royal Society of London, vol. 222,
series A (1921), p. 355.



2 CURVES FOR METHOD OF MOMENTS

the convergence of the probability integral requiring that the
coefficient of $x^4$ should be negative, and the five quantities
$a, p_1, p_2, p_3, p_4$ connected by a single relation, representing
the fact that the total probability is unity." It is with these
curves having a fourth degree polynomial in the exponent that
the present paper is concerned.

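The first step in the study of this system of frequency
functions is to find an expression for the value of the integral

$$I = \int_{-\infty}^{\infty} e^{-a^4(x^4 + p_1x^3 + p_2x^2 + p_3x + p_4)}\,dx.$$

In other words, it is necessary to know how the integral depends
on the parameters $a, p_1, p_2, p_3, p_4$.

Since $a$ depends only on the unit of measure of $x$ it will be
sufficient for the moment to consider $a = 1$. Furthermore a
linear transformation on $x$ leaves the value of the integral
unchanged. If we replace $x$ by $x - p_1/4$ the integral to be considered
becomes

$$I = k\int_{-\infty}^{\infty} e^{-(x^4 + px^2 + qx)}\,dx$$

where $k$ is a constant factor not involving $x$ and

$$p = p_2 - \tfrac{3}{8}p_1^2, \qquad q = p_3 - \tfrac{1}{2}p_1p_2 + \tfrac{1}{8}p_1^3.$$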


Consider now then the frequency curves

$$y = ke^{-(x^4 + px^2 + qx)}.$$

These curves are typically bimodal and may be classified according
to the number and kind of modes. The positions of the modes
are given by the solutions of the equation $dy/dx = 0$, that is by the
roots of the equation $4x^3 + 2px + q = 0$. The discriminant of this
cubic equation tells us that there will be three distinct real roots
and thus two distinct maxima with a minimum between them for
the curve, that is two distinct modes for the quartic exponential
curve, if $8p^3 + 27q^2 < 0$. Two roots will be real and equal if
$8p^3 + 27q^2 = 0$. The three roots will be real and equal if $p = q = 0$,



A. L. O'TOOLE


3


the three roots being $x = 0$. In the case of three real
distinct roots, if two of the roots are equal in magnitude but
opposite in sign then $q = 0$ and the curve is symmetrical with
respect to the $y$-axis. If $p = 0$, $q \neq 0$ there will be one real root
and two imaginary roots given by the three cube roots of $-q/4$.
That there will be a real maximum at the value of $x$ given by
the real cube root of $-q/4$ is easily seen from the nature of the
curve or by considering points at values of $x$ on each side of
this real cube root of $-q/4$.
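The classification by the discriminant can be tried numerically. The following is a minimal sketch, not taken from the paper: it assumes the curve is written as $y = ke^{-(x^4+px^2+qx)}$, so that the stationary points are the real roots of $4x^3 + 2px + q = 0$ and the sign of $8p^3 + 27q^2$ decides how many of them are real and distinct. The function name `classify_modes` and the tolerance `tol` are illustrative choices.

```python
def classify_modes(p: float, q: float, tol: float = 1e-12) -> str:
    """Classify the stationary points of y = k*exp(-(x**4 + p*x**2 + q*x)).

    The stationary points are the real roots of 4x^3 + 2px + q = 0; the sign
    of d = 8p^3 + 27q^2 tells how many of these roots are real and distinct.
    """
    d = 8.0 * p**3 + 27.0 * q**2
    if abs(p) <= tol and abs(q) <= tol:
        return "triple root at x = 0: one mode"
    if d < -tol:
        return "three distinct real roots: two modes and a minimum"
    if abs(d) <= tol:
        return "two equal real roots: one mode and a flat point of inflexion"
    return "one real root: one mode"


if __name__ == "__main__":
    print(classify_modes(-0.5, 0.0))   # symmetric bimodal case
    print(classify_modes(0.0, -2.0))   # asymmetric case with p = 0
    print(classify_modes(0.0, 0.0))    # the curve exp(-x**4)
```

The tolerance only guards against rounding when the discriminant is meant to vanish exactly; it is not part of the theory.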

Hence the following classes of curves and their respective
equations will be considered:

Type I: $y = ke^{-x^4}$.

The curve which is symmetrical with respect to the $y$-axis
and has only one mode, this mode being at $x = 0$.

Type II: $y = ke^{-(x^4 - 2bx^2)}$, $b > 0$.

The curve which is symmetrical with respect to the $y$-axis
and has two distinct modes at $x = \pm\sqrt{b}$.

Type III: $y = ke^{-(x^4 - 4cx)}$.

The asymmetrical curve with one real mode at $x = \sqrt[3]{c}$.

Type IV: $y = ke^{-(x^4 + px^2 + qx)}$.

The general type of curve with the quartic exponent.

Type I:

First evaluate the definite integral

$$I_0 = \int_{-\infty}^{\infty} e^{-x^4}\,dx = 2\int_0^{\infty} e^{-x^4}\,dx.$$





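Let $x^4 = y$, $dx = \tfrac14 y^{-3/4}\,dy$. Then

$$I_0 = \tfrac12\int_0^{\infty} y^{-3/4}e^{-y}\,dy = \tfrac12\,\Gamma(\tfrac14).$$

Similarly it may be shown that

$$I_1 = \int_{-\infty}^{\infty} x\,e^{-x^4}\,dx = 0$$

since the integrand is an odd function, and in general

$$I_{2n} = \int_{-\infty}^{\infty} x^{2n}e^{-x^4}\,dx = \tfrac12\,\Gamma\!\left(\tfrac{2n+1}{4}\right), \qquad I_{2n+1} = 0.$$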




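Hence if the total frequency is unity then

$$k = \frac{1}{I_0} = \frac{2}{\Gamma(\frac14)},$$

$$\mu_2 = \frac{I_2}{I_0} = \frac{\Gamma(\frac34)}{\Gamma(\frac14)} = 0.3379891 \text{ approximately,}$$

$$\mu_3 = I_3/I_0 = 0, \qquad \mu_4 = \frac{I_4}{I_0} = \frac14.$$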
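The Type I constants are quick to check with the gamma function. This is a small sketch under the assumption that the curve is $y = ke^{-x^4}$ with unit total frequency, so that $I_0 = \tfrac12\Gamma(\tfrac14)$ and the even moments are ratios of gamma functions; the function name `type1_constants` is illustrative.

```python
import math


def type1_constants():
    """Return (I0, k, mu2, mu4) for y = k*exp(-x**4) with unit total frequency."""
    g14 = math.gamma(0.25)
    i0 = 0.5 * g14                  # integral of exp(-x^4) over the whole line
    k = 1.0 / i0                    # ordinate at the mode x = 0
    mu2 = math.gamma(0.75) / g14    # second moment I2/I0
    mu4 = math.gamma(1.25) / g14    # fourth moment I4/I0, which equals 1/4
    return i0, k, mu2, mu4


if __name__ == "__main__":
    for name, value in zip(("I0", "k", "mu2", "mu4"), type1_constants()):
        print(f"{name} = {value:.7f}")
```

The value of $k$ obtained this way agrees with the ordinate at $x = 0$ tabulated for the Type I illustration at the end of the paper.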








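Type II:

Consider the definite integral

$$I_0 = \int_{-\infty}^{\infty} e^{-(x^4 - 2bx^2)}\,dx, \qquad b > 0,$$

and write $I_n$ for the corresponding integral with the factor $x^n$
in the integrand. Integrate by parts letting

$u = e^{-(x^4 - 2bx^2)}$ and $dv = dx$. Then

$$I_0 = \Big[x\,e^{-(x^4 - 2bx^2)}\Big]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} (4x^4 - 4bx^2)\,e^{-(x^4 - 2bx^2)}\,dx = 4I_4 - 4bI_2.$$

Now obviously $I_0$ cannot be zero. Hence dividing by $I_0$ we
find $1 = 4\mu_4 - 4b\mu_2$, therefore

$$b = \frac{4\mu_4 - 1}{4\mu_2}$$

($\mu_2$ cannot be zero).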


Now that $b$ is known (calculated from the given data by this
last formula) it is possible in any particular problem to find by
mechanical quadrature the value of the integral $I_0$ to any desired
degree of approximation. The simple rectangle formula with even
a small number of ordinates known will give a good approximation.
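The rectangle formula referred to here can be sketched in a few lines. This is an illustration only, not the author's computation: the integrand is assumed to be $e^{-(x^4 - 2bx^2)}$, and the function name `rectangle_I0`, the step `h` and the cut-off `x_max` are choices made for the example.

```python
import math


def rectangle_I0(b: float, h: float = 0.1, x_max: float = 3.0) -> float:
    """Rectangle-rule estimate of the integral of exp(-(x^4 - 2*b*x^2)).

    Ordinates are taken at x = 0, +-h, +-2h, ... up to +-x_max; beyond that
    the integrand is negligible for the values of b considered here.
    """
    n = int(round(x_max / h))
    total = 0.0
    for j in range(-n, n + 1):
        x = j * h
        total += math.exp(-(x**4 - 2.0 * b * x**2))
    return h * total


if __name__ == "__main__":
    for b in (0.0, 0.25, 0.5):
        print(f"b = {b:4.2f}   I0 ~ {rectangle_I0(b):.6f}")
```

Because the integrand and all its derivatives vanish rapidly at both ends, equally spaced ordinates do much better here than the rectangle rule does on a general integrand.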

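Return now to the integration by parts just performed. The
result takes the form

$$I_0 = 4I_4 - 4bI_2$$

or, since $I_2 = \tfrac12\,dI_0/db$ and $I_4 = \tfrac14\,d^2I_0/db^2$,

$$\frac{d^2I_0}{db^2} - 2b\,\frac{dI_0}{db} - I_0 = 0,$$

which is a Riccati differential equation. Riccati's equation is²


² Johnson's Differential Equations, p. 227.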





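It has a solution capable of expression in finite form in terms
of elementary functions if $m$ is the reciprocal of an odd positive
integer. In our equation $m$ is not of this form; hence no finite
solution for the differential equation is possible. That is, no
finite expression in terms of elementary functions can be obtained
for $I_0$. The solution of the Riccati equation here is

$$I_0 = A\left\{1 + \frac{b^2}{2!} + \frac{5\,b^4}{4!} + \frac{9\cdot5\,b^6}{6!} + \cdots\right\} + B\left\{b + \frac{3\,b^3}{3!} + \frac{7\cdot3\,b^5}{5!} + \cdots\right\}. \qquad (1)$$

To determine $A$ and $B$ we note that when $b = 0$ then $I_0 = \tfrac12\Gamma(\tfrac14)$,
and that when $b = 0$ then $dI_0/db = 2I_2 = \Gamma(\tfrac34)$. Hence
$A = \tfrac12\Gamma(\tfrac14)$ and $B = \Gamma(\tfrac34)$.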

It is worth while to make certain transformations on the 
differential equation 


Let « ip ^ . Then 

311 O, 


Let bV^.b . Then 



CURVES FOR METHOD OF MOMENTS 




di 




C7. 


Let V » ^ . Then 





o. 


Let where /a/7. Then 

This last equation is Bessel's differential equation³ with
Hence its solution is


4 4 ? 

^ rn^I/r7^£j‘JZ 

t,o ry rt+li-rj-r/ 


The above transformations give 




V4 

e K 


Hence 

I„•CM/^J 


}/< 

e 







³ Johnson's Differential Equations, p. 235.





Settingr 2>--0 in ^ we find 3=^rC-^Jr’(' -^ ) 


Setting <3 in 


c^jT^ 

dd 


we find A - 


^ Jr rij 


Putting in these values for $A$ and $B$ we find finally

lo - ^ ^ 

( 2 ) 

j]- 

It is worth noting, for purposes of computation, that the
expression (2) converges much more rapidly than the form (1)
given above,⁴ on account of the factoring out of $e^{b^2/2}$. In
addition the series in (2) have the advantage that the powers
increase by 4 instead of by 2 as in (1). It will be shown presently
that ordinarily $b$ is less than unity. But even for $b = 1$ it will
not be necessary to go beyond the early terms of the series to get
at least seven decimal places of accuracy. For $b$ less than one even
fewer terms will suffice for this degree of accuracy.


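⁴ The form (1) is obtained immediately if we write

$$I_0 = \int_{-\infty}^{\infty} e^{-x^4}e^{2bx^2}\,dx = \int_{-\infty}^{\infty} e^{-x^4}\sum_{n=0}^{\infty}\frac{(2bx^2)^n}{n!}\,dx,$$

assume term by term integration permissible, and make use of the fact
already mentioned that

$$\int_{-\infty}^{\infty} x^{2n}e^{-x^4}\,dx = \tfrac12\,\Gamma\!\left(\tfrac{2n+1}{4}\right).$$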





From the point of view of the Riccati differential equation it
can be shown that

$$I_0 = \int_{-\infty}^{\infty} e^{-(x^4 - 2bx^2)}\,dx$$

is the solution of

$$\frac{d^2I_0}{db^2} - 2b\,\frac{dI_0}{db} - I_0 = 0$$

when the solution is sought in the form of a definite integral.⁵
For the differential equation
D and and ^ are polynonlial^ 
in S with constant coefficients is satisfied b} 


,.cfl 

where <r is a constant, 775^ is the reciprocal of and <<■ and 
fi are so chosen that for all values of d 

Jo( 


Let /d k. 


Then 


//drtj rrtjcbb- -jrf- Jo^O^ 


and 
Hence 


€ ^ 




» (9 for all values of if oc^y^/d^r^. 


v.c 






⁵ A. R. Forsyth's Differential Equations, 6th edition, 1929, pp. 277-280.





C 






Now let c=-\f^ . Then 



An idea of the variation of $I_0$ as a function of $b$ can be
obtained from the following table calculated from (1) for values
of $b$ at intervals of 0.1 from 0 to 1 and using the values of
$\Gamma(\tfrac14)$ and $\Gamma(\tfrac34)$. The results are plotted in the accompanying
graph, Fig. 1.



   b        I_0

  0.0    1.812 805
  0.1    1.945 063
  0.2    2.099 726
  0.3    2.282 225
  0.4    2.499 648
  0.5    2.761 349
  0.6    3.079 783
  0.7    3.471 748
  0.8    3.960 152
  0.9    4.576 578
  1.0    5.365 158


The modes are at $x = \pm\sqrt{b}$. Ordinarily the ordinate at the
modes will not be greater than $e$ times the ordinate at
$x = 0$. Hence ordinarily it will not be necessary to consider
values of $b$ greater than unity.
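The tabulated values of $I_0$ can be recomputed from the power series in $b$. The sketch below assumes the series obtained by expanding $e^{2bx^2}$ and integrating term by term, namely $I_0 = \tfrac12\sum_n \frac{(2b)^n}{n!}\,\Gamma\!\left(\tfrac{2n+1}{4}\right)$; the function name `I0_series` and the fixed number of terms are illustrative choices, not the author's working.

```python
import math


def I0_series(b: float, terms: int = 80) -> float:
    """Sum the power series in b for the integral of exp(-(x^4 - 2*b*x^2))."""
    total = 0.0
    for n in range(terms):
        # integral of x^(2n) * exp(-x^4) over the whole line is Gamma((2n+1)/4)/2
        total += (2.0 * b) ** n / math.factorial(n) * math.gamma((2 * n + 1) / 4.0)
    return 0.5 * total


if __name__ == "__main__":
    for tenth in range(11):
        b = tenth / 10.0
        print(f"{b:3.1f}   {I0_series(b):.6f}")
```

For $b \le 1$ the terms fall off quickly, so the fixed count of 80 terms is far more than is needed; it is kept only to avoid a stopping rule in a short example.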



FIG. 1




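$$I_1 = \int_{-\infty}^{\infty} x\,e^{-(x^4 - 2bx^2)}\,dx = 0, \qquad I_2 = \frac12\,\frac{dI_0}{db}, \qquad I_3 = 0, \qquad I_4 = \frac14\,\frac{d^2I_0}{db^2}, \quad \text{etc.}$$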





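To find these derivatives one might use the relations⁶


But term by term differentiation is permissible and for this
purpose it is simpler to use (1) rather than (2). We find

$$\frac{dI_0}{db} = A\left\{b + \frac{5\,b^3}{3!} + \frac{9\cdot5\,b^5}{5!} + \frac{13\cdot9\cdot5\,b^7}{7!} + \cdots\right\} + B\left\{1 + \frac{3\,b^2}{2!} + \frac{7\cdot3\,b^4}{4!} + \frac{11\cdot7\cdot3\,b^6}{6!} + \cdots\right\},$$

$$\frac{d^2I_0}{db^2} = A\left\{1 + \frac{5\,b^2}{2!} + \frac{9\cdot5\,b^4}{4!} + \cdots\right\} + B\left\{3b + \frac{7\cdot3\,b^3}{3!} + \frac{11\cdot7\cdot3\,b^5}{5!} + \cdots\right\}.$$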



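Since $b > 0$, $I_0$ and all its derivatives are greater than zero.
Now the total probability is to be unity, hence take $k = 1/I_0$. Then

$$\mu_1 = 0, \qquad \mu_2 = \frac{1}{2I_0}\frac{dI_0}{db}, \qquad \mu_3 = 0, \qquad \mu_4 = \frac{1}{4I_0}\frac{d^2I_0}{db^2}, \quad \text{etc.}$$


Type III:

$$y = ke^{-(x^4 - 4cx)}.$$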


⁶ Whittaker and Watson, Modern Analysis, third edition, p. 360.





This curve is not symmetrical. But, obviously, changing $c$
to $-c$ has the same effect as changing $x$ to $-x$, or simply
reversing the shape of the curve and the distribution from which it
arises. Hence it will be necessary to consider only positive values
of $c$. As stated already, it is easy to show that there is a real
mode at the point given by $x$ equal to the real cube root of $c$.
If $c = 1$ then $y = ke^3$ at the mode, that is $e^3$ times the value of
the ordinate at $x = 0$. Hence usually $c$ will not be as great as unity.



FIG. 2

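Let

$$I_0 = \int_{-\infty}^{\infty} e^{-(x^4 - 4cx)}\,dx.$$

Integrate by parts letting $u = e^{-x^4}$, $dv = e^{4cx}\,dx$.

Then

$$I_0 = \frac{1}{c}\int_{-\infty}^{\infty} x^3 e^{-(x^4 - 4cx)}\,dx = \frac{I_3}{c}.$$

Hence

$$c = \frac{I_3}{I_0},$$

the third moment of the distribution about the origin of $x$.

With this value of $c$ calculated from the given data mechanical
quadrature can be used to find an approximate value for $I_0$.
Then let $k = 1/I_0$.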

The result of the integration by parts could have been written
in the form of the differential equation

$$\frac{d^3I_0}{dc^3} - 64\,c\,I_0 = 0.$$

It is easy to show that $I_0 = \int_{-\infty}^{\infty} e^{-(x^4 - 4cx)}\,dx$ is the
definite integral solution of the differential equation. For, here

and

for all values of $c$ if

Hence





1 /- 


eu9 


Now let and c^-iS * Then 



e czx 




An expression for the value of $I_0$ can be obtained either by
finding the series solution of the differential equation and
determining the constants by setting $c = 0$ in $I_0$ and its derivatives, or
by expanding $e^{4cx}$ in series in the definite integral itself and
then integrating term by term.
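The second route, expanding $e^{4cx}$ and integrating term by term, is easy to check against direct quadrature. The sketch below is an illustration under the assumption that the Type III integrand is $e^{-(x^4 - 4cx)}$: odd powers of $x$ integrate to zero, which leaves $I_0 = \tfrac12\sum_n \frac{(4c)^{2n}}{(2n)!}\,\Gamma\!\left(\tfrac{2n+1}{4}\right)$. The function names, the step `h` and the cut-off `x_max` are hypothetical choices for the example.

```python
import math


def type3_I0_series(c: float, terms: int = 60) -> float:
    """Series for the integral of exp(-(x^4 - 4*c*x)); only even powers survive."""
    total = 0.0
    for n in range(terms):
        total += (4.0 * c) ** (2 * n) / math.factorial(2 * n) * math.gamma((2 * n + 1) / 4.0)
    return 0.5 * total


def type3_integral(n: int, c: float, h: float = 0.05, x_max: float = 4.0) -> float:
    """Rectangle-rule estimate of the integral of x^n * exp(-(x^4 - 4*c*x))."""
    steps = int(round(x_max / h))
    total = 0.0
    for j in range(-steps, steps + 1):
        x = j * h
        total += x**n * math.exp(-(x**4 - 4.0 * c * x))
    return h * total


if __name__ == "__main__":
    c = 0.5
    i0 = type3_integral(0, c)
    print("series    :", type3_I0_series(c))
    print("quadrature:", i0)
    print("I3 / I0   :", type3_integral(3, c) / i0)   # should reproduce c
```

The last line checks the relation obtained from the integration by parts, that the third moment about the origin equals $c$.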


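$$I_0 = \int_{-\infty}^{\infty} e^{-(x^4 - 4cx)}\,dx = \int_0^{\infty} e^{-(x^4 - 4cx)}\,dx + \int_0^{\infty} e^{-(x^4 + 4cx)}\,dx = 2\int_0^{\infty} e^{-x^4}\cosh 4cx\,dx$$

$$= 2\int_0^{\infty} e^{-x^4}\left\{1 + \frac{(4cx)^2}{2!} + \frac{(4cx)^4}{4!} + \frac{(4cx)^6}{6!} + \cdots\right\}dx$$

$$= 2\sum_{n=0}^{\infty}\frac{(4c)^{2n}}{(2n)!}\int_0^{\infty} x^{2n}e^{-x^4}\,dx = \frac12\sum_{n=0}^{\infty}\frac{(4c)^{2n}}{(2n)!}\,\Gamma\!\left(\frac{2n+1}{4}\right).$$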


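These series may be differentiated term by term to obtain the
derivatives of $I_0$ and hence

$$I_1 = \int_{-\infty}^{\infty} x\,e^{-(x^4 - 4cx)}\,dx = \frac14\,\frac{dI_0}{dc},$$

$$I_2 = \int_{-\infty}^{\infty} x^2 e^{-(x^4 - 4cx)}\,dx = \frac{1}{4^2}\,\frac{d^2I_0}{dc^2},$$

$$I_3 = \int_{-\infty}^{\infty} x^3 e^{-(x^4 - 4cx)}\,dx = \frac{1}{4^3}\,\frac{d^3I_0}{dc^3},$$

$$I_4 = \int_{-\infty}^{\infty} x^4 e^{-(x^4 - 4cx)}\,dx = \frac{1}{4^4}\,\frac{d^4I_0}{dc^4}.$$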



If $x$ is replaced by $x + \sqrt[3]{c}$ the effect is to translate the modal
value of $x$ to the origin. The equation of the curve then becomes

$$y = k\,e^{-(x^4 + 4c^{1/3}x^3 + 6c^{2/3}x^2 - 3c^{4/3})}.$$


Type IV:

$$y = ke^{-(x^4 + px^2 + qx)}.$$




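Consider the definite integral

$$I_0 = \int_{-\infty}^{\infty} e^{-(x^4 + px^2 + qx)}\,dx.$$

If $p = 0$, $q = 0$ we get Type I; if $q = 0$ we get Type II. If $p = 0$,
$q \neq 0$ we get Type III. Hence consider now $p \neq 0$, $q \neq 0$.

Integrate $I_0$ by parts with $u = e^{-(x^4 + px^2)}$ and $dv = e^{-qx}\,dx$. Then

$$I_0 = -\frac{1}{q}\int_{-\infty}^{\infty} (4x^3 + 2px)\,e^{-(x^4 + px^2 + qx)}\,dx = -\frac{1}{q}\,(4I_3 + 2pI_1).$$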







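Now divide by $I_0$ and multiply by $q$. Then

$$q = -4\mu_3 - 2p\mu_1.$$

Begin again with $I_0$ and integrate by parts, this time with
$u = e^{-(x^4 + px^2 + qx)}$ and $dv = dx$. Then

$$I_0 = \int_{-\infty}^{\infty} (4x^4 + 2px^2 + qx)\,e^{-(x^4 + px^2 + qx)}\,dx = 4I_4 + 2pI_2 + qI_1.$$

Divide by $I_0$. Then

$$1 = 4\mu_4 + 2p\mu_2 + q\mu_1.$$

Now substitute for $q$ and we get

$$p = \frac{1 - 4\mu_4 + 4\mu_1\mu_3}{2(\mu_2 - \mu_1^2)}, \qquad q = -4\mu_3 - 2p\mu_1. \qquad (3)$$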
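The two relations obtained from the integrations by parts can be verified numerically. The sketch below is not the author's procedure: it assumes the curve $y = ke^{-(x^4+px^2+qx)}$, forms the moments about the origin by a rectangle rule, and then solves the pair of linear relations $q = -4\mu_3 - 2p\mu_1$ and $1 = 4\mu_4 + 2p\mu_2 + q\mu_1$ for $p$ and $q$. The function names, the step `h` and the cut-off `x_max` are illustrative.

```python
import math


def moments_about_origin(p: float, q: float, n_max: int = 4,
                         h: float = 0.05, x_max: float = 5.0) -> list:
    """Return [1, mu1, ..., mu_n_max] for y proportional to exp(-(x^4 + p x^2 + q x))."""
    steps = int(round(x_max / h))
    integrals = [0.0] * (n_max + 1)
    for j in range(-steps, steps + 1):
        x = j * h
        w = math.exp(-(x**4 + p * x**2 + q * x))
        for n in range(n_max + 1):
            integrals[n] += h * x**n * w
    return [value / integrals[0] for value in integrals]


def solve_p_q(mu: list) -> tuple:
    """Solve the two moment relations for p and q (mu[n] is the n-th moment)."""
    mu1, mu2, mu3, mu4 = mu[1], mu[2], mu[3], mu[4]
    p = (1.0 - 4.0 * mu4 + 4.0 * mu1 * mu3) / (2.0 * (mu2 - mu1**2))
    q = -4.0 * mu3 - 2.0 * p * mu1
    return p, q


if __name__ == "__main__":
    mu = moments_about_origin(p=-1.0, q=0.3)
    print(solve_p_q(mu))   # should return values close to (-1.0, 0.3)
```

The denominator $\mu_2 - \mu_1^2$ is the variance of the distribution, so the division is safe for any proper frequency curve.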







The result of the two integrations by parts can be written in 
the form of two simultaneous partial differential equations. They 
are 








dp dg dq 


r n 

^ dp^ dp ^ ^ 


Let ^ / e af^, 

- CO 


Then 


“09 





-C^'^p./c^J 


aitf^ O. 







- (Ut'*-*’ px^J -qx 

e e ^ c/x 


•Off 


tx 


/f 







j/ 



dx 





Thus










r-/y 


C^/7 ^ 


^ ^Z4 




^^T74-l 


^ 'V 


- r-/y ^ 

<^9 


o*> ^ 

'£) TTTT)! 


When the values of $p$ and $q$ are calculated from the data
of any given problem by the formulas (3) then values for $I_0$,
$I_1$, $I_2$, etc. can be obtained by mechanical quadrature.

For two real, distinct modes $8p^3 + 27q^2 < 0$. Hence $p < 0$. If
$8p^3 + 27q^2 = 0$, $p < 0$, then one mode flattens
forming a point of inflexion with a horizontal tangent at the
minimum point. Changing $q$ to $-q$ has the same effect as changing
$x$ to $-x$ and hence $q$ is a component of skewness of the curve.
If the curve is so placed that the sum of the values of $x$ at the
modes and at the minimum point is zero then the equation of the
curve will be of the form

$$y = ke^{-(x^4 + px^2 + qx)};$$

if now we change the scale of $x$ by replacing $x$ by $ax$ then
we are led to the functions of the form

$$y = y_0\,e^{-(a^4x^4 + a^2px^2 + aqx)}.$$


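Performing the two integrations by parts, as before, on the
integral

$$I = \int_{-\infty}^{\infty} e^{-(a^4x^4 + a^2px^2 + aqx)}\,dx$$

leads to the relations

$$aq = -4a^4\mu_3 - 2a^2p\,\mu_1, \qquad 1 = 4a^4\mu_4 + 2a^2p\,\mu_2 + aq\,\mu_1.$$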




If 




then 


L <s ^^9). 

In particular,









/~T ^ ^ I 

4 4 ' 


Mi^ ^1 

I /I ^ 

><'^- 4/4 ='4 ' 

~ ■ 4 j 




L- 

4 '^o 




^ -z /_r - i rrz^v 


In the case of Type I where

$$y = y_0\,e^{-a^4x^4},$$

$p = q = 0$ and hence from (4), or as can be shown directly,

$$a^4 = \frac{1}{4\mu_4}.$$

In Type II,

$$y = y_0\,e^{-(a^4x^4 + a^2px^2)},$$







and hence

$$2a^2p\,\mu_2 = 1 - 4a^4\mu_4.$$


In the Type III where 









and hence 






In general, since $p$ and $q$ are determined when the modes and
minimum point of the curve are known, theoretically at least, $a$
is fixed by the relations (4). In practice, however, this would
mean that the accuracy in the determination of $a$ would be
contingent upon the accuracy with which the modes and minimum
point are determined. Hence other methods for fixing $a$ will be
required in general. Now if in the integral we replace $p$ and $q$
by (4), which involve only $a$ and quantities calculable from the
given data, we have a function of $a$ alone, say $f(a)$. It will be
sufficient then if we determine a value of $a$ such that $f(a) = N$
where $N$ is the total given frequency. Then fix $p$ and $q$ by (4)
and the modes and minimum point by $4a^4x^3 + 2a^2px + aq = 0$.


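The points of inflexion are found from the equation

$$\frac{d^2y}{dx^2} = 0$$

and for Type I are given by $4a^4x^4 = 3$. Hence

$$x = \pm\frac{0.93060}{a}$$

approximately. For Type II they are given by
$(4a^4x^3 + 2a^2px)^2 - 12a^4x^2 - 2a^2p = 0$. For Type III they are given by
$(4a^4x^3 + aq)^2 - 12a^4x^2 = 0$. And in general they are given by roots of the
equation

$$(4a^4x^3 + 2a^2px + aq)^2 - 12a^4x^2 - 2a^2p = 0.$$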







It will be noticed that the distribution given by

$$y = y_0\,e^{-(a^4x^4 + a^2px^2 + aqx)}$$

can have the mean at the origin if and only if $q = 0$, that is,
if and only if the distribution is symmetrical. Now replace $x$ by
$x - m$. The area remains the same and hence also $y_0$.

The equation then is

$$y = y_0\,e^{-[a^4(x-m)^4 + a^2p(x-m)^2 + aq(x-m)]}$$

where $p$ and $q$ are given by the relations (4) above. An
integration by parts shows that





TYPE I 




This small beginning of the study of the system of frequency 
curves with the quartic exponent will be concluded here with the 
construction of artificial illustrations of Types I and II.

TYPE I

   x         y         $x^2y$      $x^4y$

  0.0    0.5516313   0.0000000   0.0000000
  0.1     .5515762    .0055158    .0000551
  0.2     .5507494    .0220300    .0008812
  0.3     .5471811    .0492463    .0044322
  0.4     .5376888    .0860302    .0137648
  0.5     .5182096    .1295524    .0323881
  0.6     .4845787    .1744483    .0628014
  0.7     .4338852    .2126037    .1041758
  0.8     .3662367    .2343915    .1500105
  0.9     .2862255    .2318426    .1877925
  1.0     .2029338    .2029338    .2029338
  1.1     .1275846    .1543774    .1867966
  1.2     .0693579    .0998754    .1438205
  1.3     .0317147    .0535978    .0905803
  1.4     .0118376    .0232017    .0454753
  1.5     .0034917    .0078563    .0176767
  1.6     .0007861    .0020124    .0051518
  1.7     .0001301    .0003760    .0010866
  1.8     .0000152    .0000492    .0001596
  1.9     .0000012    .0000043    .0000156
  2.0     .0000001    .0000004    .0000016

  Sum    5.2758155   1.6899455   1.2500000


Total frequency = [2(5.2758155) - 0.5516313]/10 = 1.0000000

$\mu_2$ = [2(1.6899455) - 0.0000000]/10 = 0.3379891

$\mu_4$ = [2(1.2500000) - 0.0000000]/10 = 0.2500000






TYPE II 




TYPE II

   x         y         $x^2y$      $x^4y$

  0.0    0.4572267   0.0000000   0.0000000
  0.1     .4594725    .0045947    .0000459
  0.2     .4657175    .0186287    .0007451
  0.3     .4744135    .0426972    .0038427
  0.4     .4827888    .0772462    .0123594
  0.5     .4867153    .1216788    .0304197
  0.6     .4808614    .1731101    .0623196
  0.7     .4594725    .2251415    .1103193
  0.8     .4180410    .2675462    .1712296
  0.9     .3556970    .2881146    .2333728
  1.0     .2773220    .2773220    .2773220
  1.1     .1936552    .2343228    .2835306
  1.2     .1181056    .1700721    .2449038
  1.3     .0611958    .1034209    .1747813
  1.4     .0261429    .0512401    .1004306
  1.5     .0089145    .0200576    .0451297
  1.6     .0023433    .0059988    .0153571
  1.7     .0004575    .0013222    .0038211
  1.8     .0000638    .0002067    .0006697
  1.9     .0000061    .0000220    .0000795
  2.0     .0000004    .0000016    .0000064

  Sum    5.2286133   2.0827448   1.7706859


Total frequency = [2(5.2286133) - 0.4572267]/10 = 1.0000000

$\mu_2$ = [2(2.0827448) - 0.0000000]/10 = 0.4165490

$\mu_4$ = [2(1.7706859) - 0.0000000]/10 = 0.3541372

From relations (1) or (2) it is found that when $b = 0.25$
($p = -0.5$, $q = 0$) then $I_0 = 2.187099$. Conversely, the formula

$$b = \frac{4\mu_4 - 1}{4\mu_2}$$

gives, retaining six decimal places, $b = 0.250000$.
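The Type II illustration can be rebuilt end to end as a check on the tabulated sums. The sketch below is an illustration, not the author's worksheet: it assumes $b = 0.25$, ordinates at intervals of 0.1 from 0 to 2 as in the table, $k = 1/I_0$ with $I_0$ taken from the power series in $b$, and the symmetric rectangle sums used for the total frequency and the moments. The function name `type2_example` is hypothetical.

```python
import math


def type2_example(b: float = 0.25, h: float = 0.1, x_max: float = 2.0):
    """Rebuild the sums of the Type II illustration and recover b from the moments."""
    # I0 from the power series in b (term by term integration of exp(2*b*x^2))
    i0 = 0.5 * sum((2.0 * b) ** n / math.factorial(n) * math.gamma((2 * n + 1) / 4.0)
                   for n in range(60))
    k = 1.0 / i0
    xs = [j * h for j in range(int(round(x_max / h)) + 1)]      # 0.0, 0.1, ..., 2.0
    ys = [k * math.exp(-(x**4 - 2.0 * b * x**2)) for x in xs]
    sum_y = sum(ys)
    sum_x2y = sum(x**2 * y for x, y in zip(xs, ys))
    sum_x4y = sum(x**4 * y for x, y in zip(xs, ys))
    total = h * (2.0 * sum_y - ys[0])     # both halves of the curve, centre counted once
    mu2 = h * 2.0 * sum_x2y
    mu4 = h * 2.0 * sum_x4y
    b_hat = (4.0 * mu4 - 1.0) / (4.0 * mu2)
    return i0, total, mu2, mu4, b_hat


if __name__ == "__main__":
    for name, value in zip(("I0", "total", "mu2", "mu4", "b"), type2_example()):
        print(f"{name:5s} = {value:.7f}")
```

The recovered value of $b$ agrees with the assumed one to about six decimal places, which is the point of the closing remark in the text.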

(To be Continued in May Issue)














ON THE LOGARITHMIC FREQUENCY DISTRIBUTION
AND THE SEMI-LOGARITHMIC
CORRELATION SURFACE*

By

Pae-Tsi Yuan


INTRODUCTION**

The method of treating frequency curves as developed chiefly
by Edgeworth, Kapteyn, Van Uven and Wicksell occupies an
important place in both theoretical and applied statistics. The
essence of this method may be briefly summarized as follows.
Suppose a function $f(z)$ of the variable $z$ is distributed according
to the normal law of error. Then, $z$ certainly cannot be also
normally distributed, unless the function is a linear function of $z$.
Without losing generality, we shall write the normally distributed
function in standard units as $x = f(z)$. Thus the origin of $x$ is
its mean and the unit of $x$ is its standard deviation. The relative
frequency of values of $x$ between $x$ and $x + dx$ is, therefore

$$\varphi(x)\,dx = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx,$$

and the relative frequency of values of $z$ between $z$ and $z + dz$ is

$$F(z)\,dz = \frac{1}{\sqrt{2\pi}}\,e^{-[f(z)]^2/2}\,f'(z)\,dz.$$

Thus if we have an observed frequency distribution of $z$
and we know a normally distributed function of $z$, then we can
* A Dissertation Submitted in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy in the University of Michigan.

** Papers written by the writers mentioned in this introduction are
listed under the writers' names in the Bibliography at the end of this paper.



PAE-TSI YUAN


31


graduate the distribution of $z$ by using this formula. Edgeworth
calls this method of graduating a frequency distribution the
method of translation. In two papers on “Skew Frequency Curves
in Biology and Statistics” published in 1903 and 1916, J. C.
Kapteyn elegantly set forth a theoretical foundation of this
method. Later Wicksell gave a similar justification. Both of
them based their “genetic theory of frequency”, to use Wicksell's
terminology, upon a generalized hypothesis of elementary errors.

In the present paper, we are interested only in the important
special case where $x = \frac{1}{c}\log\frac{z-a}{b}$. The frequency function of
$z$, then, becomes

$$F(z) = \frac{1}{c\sqrt{2\pi}\,(z-a)}\;e^{-\frac{1}{2c^2}\left[\log\frac{z-a}{b}\right]^2},$$

which is called the logarithmic frequency function.*

Numerous papers have been written on this frequency curve.
Among the early writers were Francis Galton and McAllister.
But a systematic treatment on the properties of this curve from
the standpoint of mathematical statistics is still lacking. Hence,
in the first part of this paper, such a treatment will be given, thus
leading to some interesting relationships among the characteristics
of this curve.

Various methods of determining the parameters of this
frequency function have been proposed by writers on this subject.
Pearson is the first writer to make use of the method of moments.
Later this method was also applied by Jørgensen and Wicksell.
In this paper, the method of moments will be considered and a
table will be provided to facilitate the computation of the constants
by this method.

Edgeworth, Wicksell and Van Uven all have contributed in
* For a justification of this frequency function on Weber-Fechner's
Psychophysical Law see the “Calculus of Observations” by E. T.
Whittaker and G. Robinson, pp. 217-218. (Blackie & Son Ltd., London
and Glasgow, 1929)



LOGARITHMIC FREQUENCY DISTRIBUTION


32

extending the method of translation to correlation surfaces.
Wicksell's logarithmic correlation surface is particularly noteworthy.
In the last part of this paper, a semi-logarithmic correlation
surface of two variables will be developed and its properties studied.

The writer wishes to express his appreciation for the assistance
Professor Cecil C. Craig has given him in making this study.


PART I

THE LOGARITHMIC FREQUENCY DISTRIBUTION

For the sake of clarity, it is desirable to state at the outset
that the logarithmic frequency distribution represented by

$$F(z) = \frac{1}{c\sqrt{2\pi}\,(z-a)}\;e^{-\frac{1}{2c^2}\left[\log\frac{z-a}{b}\right]^2} \qquad (1)$$


is unimodal and has three parameters. The parameter $a$ is the
finite lower or upper limit of $z$ according to whether $b$ is
positive or negative. In the following discussions, unless the sign
of $b$ plays an important role, we shall take $b$ to be positive
and $a$ to be the finite lower limit of $z$. However, the results
of our discussions can be easily modified to cover the case where
$b$ is negative and $a$ is the finite upper limit of $z$.

In the first eight sections of Part I the properties of the
logarithmic frequency distribution will be treated from the
standpoint of mathematical statistics,* and in section 9 the numerical
application of this distribution will be discussed.

1. AVERAGES

We shall first give the analytic expressions of four different
averages of $z$ and then observe their relative magnitudes.

* Some topics under consideration here in regard to the properties of
the logarithmic frequency distribution have also been discussed by many
writers, among whom we may particularly mention McAllister, Kapteyn,
Pearson and Pretorius. See the references under these writers' names in
the Bibliography of this paper.





By definition, the arithmetic mean of $z$ is

$$m = \int_a^{\infty} z\,F(z)\,dz = b\,e^{c^2/2} + a.$$

The logarithm of the geometric mean of $z$ about the point
$z = a$ is given by

$$\int_a^{\infty} \log(z-a)\,F(z)\,dz = \log b.$$

Hence, the geometric mean of $z$ about $z = a$, measured from
$z = 0$, is

$$m_g = b + a.$$

Since the median of $z$ corresponds to $x = 0$,
it is equal to

$$m_d = b + a.$$

Setting the derivative $dF(z)/dz$ equal to zero, we obtain the mode of $z$

$$m_o = b\,e^{-c^2} + a.$$

Thus, the geometric mean and the median are equal. Moreover,
$m_o < m_d = m_g < m$.
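The four averages are simple enough to evaluate directly. The sketch below assumes the parametrization in which $x = \frac{1}{c}\log\frac{z-a}{b}$ is a standard normal deviate, so that the mean is $a + be^{c^2/2}$, the median and geometric mean are $a + b$, and the mode is $a + be^{-c^2}$. It is an illustration only; the function name `log_averages` is hypothetical, and the parameter values used in the demonstration are those of a curve with zero mean and unit standard deviation.

```python
import math


def log_averages(a: float, b: float, c: float):
    """Return (mean, median, mode) of z = a + b*exp(c*x), x a standard normal deviate."""
    mean = a + b * math.exp(0.5 * c * c)
    median = a + b                       # also the geometric mean measured from z = 0
    mode = a + b * math.exp(-c * c)
    return mean, median, mode


if __name__ == "__main__":
    # a = -1, b = 1/sqrt(2), c = sqrt(log 2): zero mean, unit standard deviation
    a, b, c = -1.0, 1.0 / math.sqrt(2.0), math.sqrt(math.log(2.0))
    mean, median, mode = log_averages(a, b, c)
    print(f"mean = {mean:.4f}, median = {median:.4f}, mode = {mode:.4f}")
```

For positive $b$ the output shows the ordering mode, median, mean from left to right; for negative $b$ the ordering is reversed.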

2. POINTS OF INFLECTION

The second derivative of $F(z)$ is

ctst^ c? ^ • 





The roots of the equation 

3c^ Jogr 


are the points of inflection of the logarithmic frequency curve.
We shall denote them by $z_1$ and $z_2$,


Zj, • he 





Note that the quantity under the radical sign is always
positive and greater than one. Its square root is, therefore, greater
than one in absolute value. Hence, $z_1 < m_g < z_2$. That is, the
geometric mean and the median of $z$ lie between the points of
inflection.

Furthermore, if we observe that the points of inflection may
be written in relation to the mode as







we see that $z_1 < m_o < z_2$.

But the mean does not always lie between the two inflection
points, since









Obviously, $z_1$ is always less than the mean. But when $c^2 > 4/7$,
the mean is situated above both points of inflection.

Now, the relation of the averages and the points of inflection,
when $c^2 < 4/7$, may be expressed by the inequality

$$z_1 < m_o < m_d = m_g < m < z_2,$$

which holds for almost all practical cases, since $c^2$ rarely exceeds
4/7 in practice.


3. HIGH CONTACT

A frequency function is said to have high contact, if the
function and all its derivatives vanish at the upper and the lower
limits of the variable $z$. We know that the logarithmic frequency
function vanishes at both the finite and the infinite limits of $z$.
It can be easily seen that all its derivatives also vanish at these
points, if we make the substitution $x = \frac{1}{c}\log\frac{z-a}{b}$, which will
throw every derivative of the logarithmic frequency function into
a product of two factors, one being a polynomial in $x$ and
another being $e^{-x^2/2 - kcx}$, where $k$ is a positive integer. Thus,
it is obvious that all the derivatives become zero, as $x$ approaches
$\mp\infty$, which correspond to the finite and the infinite limits of $z$.
For instance, this substitution will put the first derivative of the
logarithmic frequency function

$$\frac{dF(z)}{dz} = -\frac{1}{c\sqrt{2\pi}\,(z-a)^2}\left[1 + \frac{1}{c^2}\log\frac{z-a}{b}\right]e^{-\frac{1}{2c^2}\left[\log\frac{z-a}{b}\right]^2}$$

into the form

$$-\frac{1}{b^2c^2\sqrt{2\pi}}\,(x + c)\,e^{-x^2/2 - 2cx},$$

which clearly goes to zero as $x$ approaches $\mp\infty$, that is, as $z$
approaches $a$ and infinity.

The logarithmic frequency function, therefore, has high
contact.

4. MOMENTS

We shall study the practical application of the method of
moments to determine the parameters of the logarithmic frequency
distribution in section 9. But at present we must know the
relationships between the parameters and the moments in order to
discuss the properties of dispersion, skewness and kurtosis.

First, we shall express the moments in terms of the parameters.

The $n$-th moment $\mu_n'$ about the point $z = a$ is given by

$$\mu_n' = \int_a^{\infty} (z-a)^n F(z)\,dz = b^n e^{n^2c^2/2}.$$

And we also have the recurring relation $\mu_{n+1}' = b\,e^{(2n+1)c^2/2}\,\mu_n'$.

The $n$-th moment of $z$ about the mean is

$$\mu_n = \sum_{k=0}^{n} (-1)^k\binom{n}{k}\,\mu_{n-k}'\,(\mu_1')^k = b^n\sum_{k=0}^{n} (-1)^k\binom{n}{k}\,e^{[(n-k)^2 + k]\,c^2/2}.$$

Consequently, the $n$-th standard moment of $z$ is $\alpha_n = \mu_n/\mu_2^{n/2}$.

Setting $n$ equal to 3 and 4, we have

$$\alpha_3 = \pm\,(e^{c^2} + 2)(e^{c^2} - 1)^{1/2}, \qquad \alpha_4 = e^{4c^2} + 2e^{3c^2} + 3e^{2c^2} - 3,$$

which will be discussed in connection with skewness and kurtosis.

Note that the sign of $\alpha_3$ follows that of $b$, because the sign of
the third moment of $z$ about the mean is determined by $b^3$.

Now, we want to express the parameters in terms of the
moments. It is clear that there is an infinite number of ways to
accomplish this, since there is an infinitude of moments. But we
are particularly interested in the expressions of the parameters
in terms of the mean and the second and third moments about
the mean. Letting $\omega = e^{c^2}$, we have

$$m = a + b\,\omega^{1/2}, \qquad \mu_2 = b^2\,\omega(\omega - 1), \qquad \mu_3 = b^3\,\omega^{3/2}(\omega - 1)^2(\omega + 2). \qquad (2)$$

Solving these equations for the parameters, we find $\omega$ is the only
real root of the cubic

$$\omega^3 + 3\omega^2 - (4 + \alpha_3^2) = 0. \qquad (3)$$

Hence, the parameters $c$, $b$ and $a$ may be expressed as

$$c = \sqrt{\log\omega}, \qquad b = \pm\frac{\sigma}{\sqrt{\omega(\omega - 1)}}, \qquad a = m - b\,\omega^{1/2}, \qquad (4)$$

where the sign of $b$ follows that of $\alpha_3$ and $\sigma = \sqrt{\mu_2}$. The
practical application of (3) and (4) will be discussed in section 9.
We shall now turn our attention to some other properties of the
logarithmic frequency distribution.
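The passage from the mean, standard deviation and $\alpha_3$ to the three parameters can be sketched as follows. This is an illustration under stated assumptions, not the procedure of section 9: it takes $\omega = e^{c^2}$ to be the real root greater than one of $\omega^3 + 3\omega^2 - (4 + \alpha_3^2) = 0$, finds that root by bisection, and then sets $c = \sqrt{\log\omega}$, $b = \pm\sigma/\sqrt{\omega(\omega-1)}$ with the sign of $\alpha_3$, and $a = m - b\sqrt{\omega}$. The function name `fit_logarithmic` is hypothetical.

```python
import math


def fit_logarithmic(m: float, sigma: float, alpha3: float):
    """Return (omega, c, b, a) from the mean, standard deviation and alpha3."""
    if alpha3 == 0.0:
        raise ValueError("alpha3 = 0 is the normal limit; no finite a and b exist")

    def cubic(w: float) -> float:
        return w**3 + 3.0 * w**2 - 4.0 - alpha3**2

    lo, hi = 1.0, 2.0              # cubic(1) = -alpha3^2 < 0, so the root exceeds 1
    while cubic(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):           # plain bisection; the cubic increases for w > 0
        mid = 0.5 * (lo + hi)
        if cubic(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    omega = 0.5 * (lo + hi)
    c = math.sqrt(math.log(omega))
    b = math.copysign(sigma / math.sqrt(omega * (omega - 1.0)), alpha3)
    a = m - b * math.sqrt(omega)
    return omega, c, b, a


if __name__ == "__main__":
    for skew in (1.0, 4.0, -4.0):
        print(skew, ["%.4f" % v for v in fit_logarithmic(0.0, 1.0, skew)])
```

With zero mean and unit standard deviation the output can be compared with the corresponding columns of Table I.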

5. DISPERSION

The dispersion of $z$ about the mean may be measured by
the standard deviation, $\sigma = b\,e^{c^2/2}(e^{c^2} - 1)^{1/2}$. Denote the
deviation of $z$ from the mean in terms of the standard deviation
by $t = (z - m)/\sigma$. Then, with the aid of formulae (1), (2)
and (4), we obtain the distribution of $t$ as


/in c\j-f-('e 



where




takes the same sign as $\alpha_3$.


We know that for the normal distribution 50% of the total
frequency lies between the limits $x = -.6745$ and $x = +.6745$.
Now, we want to know the similar limits of $t$ for the logarithmic
distribution. For that reason, we write $t$ directly in terms of
the normally distributed function $x$:

$$t = \frac{e^{cx - c^2/2} - 1}{(e^{c^2} - 1)^{1/2}}. \qquad (6)$$

Placing $x$ equal to $\pm .6745$, we have at once the limits

$$t = \frac{e^{\pm .6745\,c - c^2/2} - 1}{(e^{c^2} - 1)^{1/2}}$$

between which 50% of the total frequency is included. These
limits are two quartiles and obviously depend on $c$. It is clear
that one can also locate other deciles and percentiles of $t$ by using
(6).
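The quartile limits follow at once from the normal deviate. The sketch below is an illustration that assumes $b$ positive and the relation $t = \dfrac{e^{cx - c^2/2} - 1}{\sqrt{e^{c^2} - 1}}$ between the standardized deviate $t = (z-m)/\sigma$ and the normal deviate $x$; this follows from $z = a + be^{cx}$ together with the expressions for the mean and standard deviation. The function names are hypothetical.

```python
import math


def standardized_deviate(x: float, c: float) -> float:
    """t = (z - m)/sigma written in terms of the normal deviate x (b positive)."""
    return (math.exp(c * x - 0.5 * c * c) - 1.0) / math.sqrt(math.exp(c * c) - 1.0)


def quartile_limits(c: float, x_q: float = 0.6745):
    """Limits of t between which half of the total frequency lies."""
    return standardized_deviate(-x_q, c), standardized_deviate(x_q, c)


if __name__ == "__main__":
    for c in (0.001, 0.3143, 0.8326):
        lower, upper = quartile_limits(c)
        print(f"c = {c:6.4f}:  {lower:+.4f}  to  {upper:+.4f}")
```

For very small $c$ the limits approach $\mp .6745$, the normal values, and they become increasingly unequal as $c$ grows. Other percentiles are obtained by changing `x_q`.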

An abstract measure of the dispersion is the coefficient of
variability which expresses the standard deviation in terms of the
mean. For the logarithmic distribution, with the mean measured
from $a$, it is

$$\frac{\sigma}{m - a} = (e^{c^2} - 1)^{1/2}, \qquad (7)$$

which shows that in a logarithmic distribution the larger $c$ is,
the greater is the variability.
the greater is the variability. 

It is interesting to note that if we also express the deviation
of $z$ from the mean in terms of $m - a$ and denote it by




a 




we have by (5) the distribution of this quantity in this simple form


J 



( 8 ) 


6. SKEWNESS

It has been proposed to use $\alpha_3$ as a measure of
skewness of a frequency distribution. For the logarithmic curve,
we have shown that

$$\alpha_3 = \pm\,(e^{c^2} + 2)(e^{c^2} - 1)^{1/2}. \qquad (9)$$

Hence, the absolute value of $\alpha_3$ increases with $c$. Since $c$ can
take on any finite value whatever, the skewness of the logarithmic
curve as measured by $\alpha_3$ can also have any finite value.
Moreover, as we have seen, $\alpha_3$ of the logarithmic distribution can be
positive as well as negative.

In Figure 1 are shown four logarithmic curves with nT**0 , 
cr^*/ and with varying s Various parameters calculated 
from formulae (4) and im])ortant characteristics of these curves 
are exhibited in Table I. 
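The dependence of the skewness on c can be checked with a short sketch. It assumes the usual expression $\alpha_3 = (\omega+2)(\omega-1)^{1/2}$ with $\omega = e^{c^2}$ for a logarithmically distributed variable; the helper name `alpha3_from_c` is hypothetical, and the three values of c are those tabulated for the curves of Table I.

```python
import math


def alpha3_from_c(c, sign=1.0):
    """Third standard moment for a given c; `sign` carries the sign of b."""
    omega = math.exp(c * c)
    return math.copysign((omega + 2.0) * math.sqrt(omega - 1.0), sign)


# Values of c listed for the curves of Table I.
for c in (0.0665, 0.3143, 0.8326):
    print(f"c = {c}: alpha3 = {alpha3_from_c(c):.3f}")
```

Because the expression is increasing in c, each value of $|\alpha_3|$ corresponds to exactly one c.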



When $c = 0$, $\alpha_3$ also vanishes. In fact, the logarithmic curve
approaches the normal curve of error, as c goes to zero. This
can be demonstrated as follows. With the aid of formulae (4)
we can write the normally distributed function z as

$$z = \frac{1}{c}\log\frac{x-a}{b} = \frac{1}{c}\log\left[1+(e^{c^2}-1)^{1/2}\,\frac{x-m}{\sigma}\right] + \frac{c}{2}.$$

Now, it can be easily seen that

$$\lim_{c\to 0} z = \frac{x-m}{\sigma},$$

which is a linear function of x. Hence, the logarithmic distribution
of x approaches the normal distribution as c approaches
zero.
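The approach to the normal curve can be illustrated numerically. The sketch below assumes that the normal deviate is $z = (1/c)\log[1 + (e^{c^2}-1)^{1/2}\,t] + c/2$ with $t = (x-m)/\sigma$, which is what the relations for c, b and a imply when $(1/c)\log((x-a)/b)$ is standard normal; `z_of_t` is a hypothetical name.

```python
import math


def z_of_t(t, c):
    """Normal deviate corresponding to the standardized deviation t (positive skew)."""
    root = math.sqrt(math.exp(c * c) - 1.0)
    return math.log(1.0 + root * t) / c + 0.5 * c


t = 1.3
for c in (0.1, 0.01, 0.001):
    print(f"c = {c}: z = {z_of_t(t, c):.5f}   (t = {t})")
```

As c is reduced, the printed z settles on t itself, which is the linear relation of the limiting normal case.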


TABLE I

Parameters and Important Characteristics of the Logarithmic Curves
with $m = 0$, $\sigma = 1$ and $\alpha_3$ Specified

$\alpha_3$                        .2           1           4          -4
$\omega$                      1.0044      1.1038      2.0000      2.0000
c                            .0665       .3143       .8326       .8326
a                         -15.0222     -3.1038     -1.0000      1.0000
b                          14.9890      2.9543       .7071      -.7071
Geometric mean              -.0332      -.1495      -.2929       .2929
Mode                        -.0991      -.4274      -.6465       .6465
Points of inflection       -1.09       -1.30        -.9341       .9341
                             .90         .50        -.0532       .0532
Coefficient of variability   .0666       .3221      1.0000      1.0000
$\alpha_4 - 3$                    .0712      1.8295     35.0000     35.0000
Another measure of skewness is defined by Pearson as

$$\chi = \frac{m - m_o}{\sigma},$$

where $m_o$ denotes the mode. For the logarithmic curve, it becomes

$$\chi = \frac{1-\omega^{-3/2}}{(\omega-1)^{1/2}},$$

which has a maximum value equal to .6561 when $\omega$ is about 1.72
and $\alpha_3$ is about 3.16. This, however, does not indicate that the
skewness of the logarithmic curve is limited. Rather it shows that $\chi$
is not a satisfactory measure of skewness, so far as the logarithmic
curve is concerned. For any measure of skewness should
characterize the skewness of a curve without ambiguity, and $\chi$
fails to do so in case of the logarithmic curve. For instance,
when we say that a certain logarithmic curve has $\chi = .3228$, we
may mean either a logarithmic curve with $\alpha_3 = .70$ or one with
$\alpha_3 = 36.00$.

When the logarithmic curve is only moderately skew, $\chi$
approximately equals $\alpha_3/2$. This can be shown as follows. Letting
$h = \omega - 1$, we have

$$\chi = \frac{1-(1+h)^{-3/2}}{h^{1/2}} = \tfrac{3}{2}h^{1/2} - \tfrac{15}{8}h^{3/2} + \cdots$$

and $\alpha_3 = 3h^{1/2} + h^{3/2}$.

Hence, for small $|h|$ and hence small $|\alpha_3|$, $\chi$ approximately
equals $\alpha_3/2$. For instance, when $\alpha_3 = .2$, $\chi = .0991$, which
is approximately .1.

We may mention here that for the Pearsonian type III curve,
the relation $\chi = \alpha_3/2$ always holds. In fact, it appears from
Table II that the type III curve and the logarithmic curve are
very similar for small $\alpha_3$. But the differences between them are
already pronounced for $\alpha_3 = 1$, as we can see from Table III.


TABLE II

Ordinates and Areas of the Logarithmic Curve and the
Pearsonian Type III Curve
$m = 0$, $\sigma = 1$, $\alpha_3 = .2$


z 

' 1 

OrdiniUenl % 

l.oy 1 111 ve 'r>i)c HI 

Aica from the Lower 
Liitiit to jjr 

Log Curve T} pel II 

- 3 5 

,0003 

.a(Ki2 


.000(1 

- M) 

(1020 

0(120 

(1005 

0004 

- 

OIJ-I 

0123 i 

(HU4 

0034 

- JO 

,()4'0 

(M'^2 

0172 

0171 

- 1 5 

.lu7 

,1341 

0607 

.0607 

- ] 0 

2587 

2501 

.I57</ 

1.5H2 

5 

,W2 

m? 

,.11 7H 

,3172 

fj 

,m\ 

mil 

51,32 

.5133 

.5 


,33(h! 

,7006 

.7002 

J 0 

' .2267 

1 

2267 

.84)8 

^8417 

1 5 

1242 

1245 

.9285 

9284 

2.0 

,om 

0567 

.9720 

.9721 

2 5 

0217 

.0217 

,m) 

9006 

3.0 

11072 

0071 

9972 

907.5 

vl5 

0020 

,0020 

9991 

,9W3 

4.0 

0(KV. 

,(')(KI5 

.99^;8 

,9998 

iS 

0002 

.0001 












TABLE III

Ordinates and Areas of the Logarithmic Curve and the
Pearsonian Type III Curve
$m = 0$, $\sigma = 1$, $\alpha_3 = 1$


z 

! Ordinate at Z 

\ 

1 Log, Curve Type HI 

1 

1 

Area from 

Limit 

Log. Curve 

the Lower 

to Z 

Type III 

'2.0 

.0084 

0 

.0009 

0 

- IS 

.11% 

1226 

.0259 

.0190 

-1.0 

.3.364 

3600 

.1398 

.1429 

- ,5 

4408 

Am 

3442 

.3528 

0 

4040 

.3907 

.5624 

.5665 

,5 


,2807 

,7303 

.7345 

1.0 

1/91 

1785 

.8520 

.8488 

IS 

.1017 

.1043 


.9182 

2,0 

.0548 

0573 

.9590 

.9576 

* 2,S 

.0295 

' ,0.300 

9783 

9788 

3.0 

.0144 

.0151 

.9895 

,9897 

3.S 

.0073 

0074 

.9948 

.9951 

4.0 

.0036 

.0035 

.9977 

.9977 

- 4,S 

,0017 

0017 

9987 

9<AH) 

5.0 

,0009 

.0008 

.9993 

,9995 

S.5 

,0004 

.0003 

,w; 

,9998 

6.0 

.0002 

.0002 

W8 

,9999 

6..S 

.OOOl 

.0001 

.9990 





7. KURTOSIS

Another important characteristic of a frequency curve is
kurtosis, measured by $\alpha_4$ or simply by $\eta = \alpha_4 - 3$, which
equals zero for the normal law of error. If the mean and the
standard deviation are taken to be the origin and the unit, respectively,
then usually the frequency of a curve in the vicinity of
the mean is in excess or in defect to that of a normal curve according
to whether $\eta$ is positive or negative. A curve is said
to be leptokurtic, if $\eta > 0$. It is platykurtic, if $\eta < 0$. Thus,
the logarithmic curve is always leptokurtic, for its $\eta$ is

$$\eta = e^{4c^2} + 2e^{3c^2} + 3e^{2c^2} - 6, \tag{10}$$

or $\eta = \omega^4 + 2\omega^3 + 3\omega^2 - 6$,

and $\omega > 1$. Since the logarithmic curve has only three parameters,
there exists a functional relationship between its skewness and
kurtosis. This relationship is given through the parameter $\omega$ by
(9) and (10). We may further deduce the following relations
from these two equations:

$\eta$ is always greater than $\tfrac{3}{2}\alpha_3^2$. This follows from the fact
that $\eta - \tfrac{3}{2}\alpha_3^2 = \omega^2(\omega-1)(\omega+\tfrac{3}{2}) > 0$.

For $|\alpha_3| < 6.44$, we have $3\alpha_3^2 > \eta$, since
$3\alpha_3^2 - \eta = -\omega^4 + \omega^3 + 6\omega^2 - 6 > 0$ holds, provided
$\omega < 2.8$, which corresponds to $|\alpha_3| < 6.44$.

For $|\alpha_3| < 2.15$, we have $2\alpha_3^2 > \eta$, since
$2\alpha_3^2 - \eta = (\omega-1)(2 + 2\omega - \omega^2 - \omega^3) > 0$
holds provided $\omega < 1.4$, which corresponds to $|\alpha_3| < 2.15$.

Since practically the value of $|\alpha_3|$ can hardly reach 6.44 or
even 2.15, the relations just stated hold for all practical instances.
The relationship existing between $\eta$ and $\alpha_3$ is sometimes
used as a criterion for applying the logarithmic curve to observed
data. We shall discuss this point in section 9.
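The relation between skewness and kurtosis through $\omega$ can be tabulated with a short sketch. It assumes the standard expressions $\alpha_3^2 = \omega^3 + 3\omega^2 - 4$ and $\eta = \omega^4 + 2\omega^3 + 3\omega^2 - 6$ for a logarithmically distributed variable; the function names are hypothetical.

```python
def alpha3_squared(omega):
    """Square of the third standard moment as a function of omega."""
    return omega ** 3 + 3.0 * omega ** 2 - 4.0


def eta(omega):
    """Excess of the fourth standard moment over 3 as a function of omega."""
    return omega ** 4 + 2.0 * omega ** 3 + 3.0 * omega ** 2 - 6.0


for omega in (1.05, 1.4, 2.0, 2.8):
    a2, e = alpha3_squared(omega), eta(omega)
    print(f"omega = {omega}: alpha3^2 = {a2:.4f}, eta = {e:.4f}, "
          f"eta/alpha3^2 = {e / a2:.4f}")
```

The ratio $\eta/\alpha_3^2$ printed in the last column shows directly between which multiples of $\alpha_3^2$ the kurtosis lies for each $\omega$.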



8. POWERS, ROOTS AND PRODUCTS OF THE LOGARITHMICALLY
DISTRIBUTED VARIABLES

If x is logarithmically distributed and has "a" as its lower
limit, $w = (x-a)^k$ is also so distributed, k being any constant.
This follows from the fact that if z is normally distributed,
so is kz. From the frequency function of x
given by (1), we find at once the analytic expression
of the frequency distribution of w to be

$$F(w) = \frac{1}{\sqrt{2\pi}\,|k|\,c\,w}\exp\left\{-\frac{1}{2k^2c^2}\left[\log\frac{w}{b^k}\right]^2\right\}. \tag{11}$$

We have learned from the preceding sections that a logarithmic
distribution represented by (1) with larger c has greater variability,
skewness, and kurtosis. Thus, if $k^2 > 1$, the variability,
skewness, and kurtosis are greater for w than for x. On the
other hand, if $k^2 < 1$, the distribution of x has greater variability,
skewness, and kurtosis.

If the logarithmically distributed variables $x_1, x_2, \ldots, x_n$
are independent and have $a_1, a_2, \ldots, a_n$ for their lower limits,
then the product

$$y = (x_1-a_1)(x_2-a_2)\cdots(x_n-a_n)$$

is also so distributed. This follows from the fact that if
$\log(x_1-a_1), \log(x_2-a_2), \ldots, \log(x_n-a_n)$
are each normally distributed and are independent, their sum also
obeys the normal law of error.

Since the variables are independent, the frequency distribution
of these n variables is represented by

$$F(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} F_i(x_i), \tag{12}$$

where

$$F_i(x_i) = \frac{1}{\sqrt{2\pi}\,c_i\,(x_i-a_i)}\exp\left\{-\frac{1}{2c_i^2}\left[\log\frac{x_i-a_i}{b_i}\right]^2\right\}.$$

Substituting $x_1 - a_1 = y/[(x_2-a_2)\cdots(x_n-a_n)]$ in (12) and integrating
the resulting expression with respect to $x_2, \ldots, x_n$ successively
over the respective ranges, we have the distribution of y as

$$F(y) = \frac{1}{\sqrt{2\pi}\,(c_1^2+\cdots+c_n^2)^{1/2}\,y}\exp\left\{-\frac{\left[\log y - \log(b_1 b_2\cdots b_n)\right]^2}{2(c_1^2+\cdots+c_n^2)}\right\}. \tag{13}$$

Since the sum, $c_1^2 + \cdots + c_n^2$, is greater than any $c_i^2$,
the distribution of y has greater variability, skewness, and kurtosis
than that of each individual variable.
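How the parameters combine under powers and products can be sketched as follows. The block assumes positive b and the representation $x - a = b\,e^{cz}$ with z standard normal, so that a power multiplies c by $|k|$ and a product of independent variables adds the $c_i^2$; the function names are hypothetical.

```python
import math


def power_parameters(b, c, k):
    """(b, c) of w = (x - a)**k when x - a has parameters (b, c), with b > 0."""
    return b ** k, abs(k) * c


def product_parameters(params):
    """(b, c) of the product of independent x_i - a_i with parameters (b_i, c_i)."""
    b = math.prod(b_i for b_i, _ in params)
    c = math.sqrt(sum(c_i * c_i for _, c_i in params))
    return b, c


print(power_parameters(2.0, 0.3, 2))
print(product_parameters([(2.0, 0.3), (1.5, 0.4)]))
```

Since the combined c is never smaller than any individual $c_i$, the product is at least as variable and as skew as each factor.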

9. NUMERICAL APPLICATIONS

Many methods of fitting a logarithmic frequency curve to
observed data have been proposed. But only the method of moments
will be considered below.*

The method of moments is very simple to apply. It consists
of placing the computed moments in equations (2) and then
determining the parameters by solving these equations by formulae
(3) and (4).† The only step of computation which requires
some time and care to obtain accurate results is the solution of
the cubic,

$$\omega^3 + 3\omega^2 - (4 + \alpha_3^2) = 0.$$

Hence, it is desirable to have a table which will provide an approximation
of the required root of this cubic for a given $\alpha_3$. Then,
the root can be approximated to as great a degree of accuracy
as we wish by applying, for instance, Newton's method. That is
why Table IV is constructed. Practically, after we obtain an approximate
value of $\omega$ from Table IV, one single application of
Newton's method will almost invariably suffice to give us a value
of $\omega$ accurate to four decimal places. In Table IV, values of c
corresponding to given values of $\omega$ are also provided to serve
as a check to our computation of c by formulae (4).

* Among other methods of graduating the logarithmic frequency distribution,
the graphical method proposed by Kapteyn and Van Uven is
especially useful. For a description of this method, refer to their paper on
"Skew Frequency Curves in Biology and Statistics, 2nd Paper".

† In his paper, "On the Genetic Theory of Frequency", Wicksell also
showed the application of the method of moments to the logarithmic frequency
distribution. However, he found the parameter "a" first and then
proceeded to obtain "log b" and "c".
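The refinement step can be sketched in a few lines. The block assumes the cubic in the form $\omega^3 + 3\omega^2 - (4 + \alpha_3^2) = 0$, which is what the relation $\alpha_3 = (\omega+2)(\omega-1)^{1/2}$ gives on squaring; the function name `newton_omega` is hypothetical, and the starting value plays the role of the tabular approximation. The skewness and starting value in the example are those of the weights illustration in this section.

```python
def newton_omega(alpha3, omega0, steps=1):
    """Refine an approximate root omega0 of the cubic by Newton's method."""
    omega = omega0
    for _ in range(steps):
        f = omega ** 3 + 3.0 * omega ** 2 - 4.0 - alpha3 ** 2
        f_prime = 3.0 * omega ** 2 + 6.0 * omega
        omega -= f / f_prime
    return omega


# One application, starting from the tabular value nearest the given skewness.
omega1 = newton_omega(0.976424, 1.10)
print(f"omega after one step: {omega1:.6f}")
```

Because the cubic is increasing and convex for $\omega > 1$, further steps change the result very little once the starting value is close.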

TABLE IV

Table Facilitating the Solution of the Cubic
$\omega^3 + 3\omega^2 - (4 + \alpha_3^2) = 0$

  $\omega$     $\alpha_3$       c            $\omega$     $\alpha_3$       c

 1.00    0          0
 1.01     .3010     .0998       1.26    1.6623     .4807
 1.02     .4271     .1407       1.27    1.6991     .4889
 1.03     .5248     .1719       1.28    1.7356     .4969
 1.04     .6080     .1980       1.29    1.7717     .5046
 1.05     .6820     .2209       1.30    1.8075-    .5122
 1.06     .7495+    .2415-      1.31    1.8429     .5196
 1.07     .8122     .2602       1.32    1.8781     .5269
 1.08     .8712     .2775-      1.33    1.9129     .5340
 1.09     .9270     .2936       1.34    1.9475+    .5410
 1.10     .9803     .3087       1.35    1.9819     .5478
 1.11    1.0315-    .3231       1.36    2.0160     .5545+
 1.12    1.0808     .3366       1.37    2.0499     .5611
 1.13    1.1285+    .3496       1.38    2.0836     .5675+
 1.14    1.1749     .3619       1.39    2.1171     .5738
 1.15    1.2200     .3739       1.40    2.1503     .5801
 1.16    1.2640     .3852       1.41    2.1835-    .5862
 1.17    1.3070     .3962       1.42    2.2164     .5922
 1.18    1.3492     .4068       1.43    2.2492     .5981
 1.19    1.3905-    .4171       1.44    2.2818     .6038
 1.20    1.4311     .4270       1.45    2.3143     .6096
 1.21    1.4710     .4366       1.46    2.3467     .6151
 1.22    1.5103     .4460       1.47    2.3789     .6207
 1.23    1.5491     .4550       1.48    2.4110     .6261
 1.24    1.5873     .4638       1.49    2.4430     .6315+
 1.25    1.6250     .4723       1.50    2.4749     .6368


To illustrate the use of Table IV and to help in studying the
application of the logarithmic frequency curve, we take the distribution
of the weights of 1,000 female students from the
"Synopsis of Elementary Mathematical Statistics"* by Miss B. L.
Shook. (See Table V.)

The mean, standard deviation, and skewness for this distribution†
are

$m = 118.74$ lbs.
$\sigma = 16.9175$ lbs.
$\alpha_3 = .976424$

To compute $\omega$, we find from Table IV that for $\alpha_3 = .976424$,
$\omega$ is approximately $\omega_0 = 1.10$. For a better approximation, we
apply Newton's method:

$$\omega_1 = \omega_0 - \frac{\omega_0^3 + 3\omega_0^2 - 4 - \alpha_3^2}{3\omega_0^2 + 6\omega_0} = 1.10 - \frac{.007596}{10.23} = 1.10 - .000743 = 1.099257.$$

By formulae (4), the parameters c, b and a are found to be

$c = .307677$
$b = 51.2160$ lbs.
$a = 65.0423$ lbs.

* Annals of Mathematical Statistics, Vol. I, No. 1 (1930), p. 39.
† Sheppard's corrections have been duly applied.



TABLE V

Observed and Theoretical Distributions of the Weights of
1,000 Female Students

(Original Measurements Made to Nearest 1/10 lb.)

Class Limits    Observed     Theoretical Logarithmic      Theoretical Type III
(Pounds)        Frequency    Distribution                 Distribution
                             By Areas    By Ordinates     By Areas

 70- 79.9           2            0            0                 0
 80- 89.9          16           10            6                 4
 90- 99.9          82           97           94               102
100-109.9         231          228          234               238
110-119.9         248          255          259               250
120-129.9         196          190          190               184
130-139.9         122          114          111               111
140-149.9          63           57           57                59
150-159.9          23           27           27                29
160-169.9           5           12           12                13
170-179.9           7            6            6                 6
180-189.9           1            2            2                 3
190-199.9           2            1            1                 1
200-209.9           1            1            1                 0
210-219.9           1            0            0                 0

Total           1,000        1,000        1,000             1,000


Knowing c, b and a, we obtain the geometric mean and the
mode:

geometric mean = 116.2583 lbs.
mode = 111.63 lbs.

Using these parameters, the theoretical distribution of the
weights of 1,000 female students has been computed and is shown
in Table V and Figure II. The fit of the logarithmic distribution
to the observed data is, indeed, excellent.* The lowest possible
weight of female students, according to the theoretical distribution,
is 65.04 pounds, which is just about what one would expect
after examining the observed data.

Miss Shook† used the type III distribution to fit the same
set of observed data and gave the result as shown in the last column
of Table V. The fit is not as good as that given by the logarithmic
distribution, especially in view of the fact that the type
III curve fixes the least possible weight at 84.09 pounds, while
as a matter of fact there are two students whose weights are
below that limit.‡

* Grouping the first three classes into one class and the last six classes
into one class, we apply the $\chi^2$ test for goodness of fit and find that the
probability to get a worse fit is .70.

† Annals of Mathematical Statistics, Vol. I, No. 3 (1930), p. 242.
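The "by areas" graduation can be sketched from the fitted parameters. The block assumes that $(1/c)\log((x-a)/b)$ is a standard normal variable and, since the weights were recorded to the nearest tenth of a pound, that the class 110-119.9 runs between the boundaries 109.95 and 119.95; both the boundaries and the function names are assumptions made for illustration.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def class_frequency(lower, upper, a, b, c, total):
    """Expected count between two class boundaries under the fitted curve (b > 0)."""
    z_low = math.log((lower - a) / b) / c
    z_high = math.log((upper - a) / b) / c
    return total * (normal_cdf(z_high) - normal_cdf(z_low))


A, B, C = 65.0423, 51.2160, 0.307677  # parameters quoted in the text
for lower, upper in ((109.95, 119.95), (119.95, 129.95)):
    count = class_frequency(lower, upper, A, B, C, 1000)
    print(f"{lower}-{upper}: {count:.1f}")
```

The same function, summed over all classes, gives a total very close to the number of observations, since almost no area lies outside the tabulated range.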

From the standpoint of the method of moments, a criterion
for the logarithmic distribution to fit a set of observed data is
that $\eta = \alpha_4 - 3$ computed directly from the observed data must
be approximately the same as the theoretical $\eta$ computed from
formula (10). This criterion, however, does not seem to work
in practice. For instance, for the distribution of the weights of
1,000 female students, the theoretical $\eta$ is 1.7419, while the observed
$\eta$ is 2.4536. But in spite of this fact, the observed distribution,
as we have seen, is very satisfactorily fitted by a logarithmic
distribution.

Another criterion is to require the observed moments about
the lower limit "a" to satisfy approximately the recurring relation

for 4 , This criterion is approximately fulfilled by the distri- 
bution of the weights of 1,000 female students, for which we have 
‘ 147^79 'lO^ 

- J46696 

and 

The fact that a set of observed data may be satisfactorily
graduated by the logarithmic distribution but fulfills only the
second criterion may be explained on the ground that the comparatively
wide discrepancy between the observed and theoretical
frequency in the classes near the lower limit makes a great difference
in the fourth moment about the mean but does not make
much difference in the fourth moment about the point "a".

‡ In fact, since the finite limit of the variable, measured from the mean, is
$-2\sigma/\alpha_3$ for the type III curve and $-\sigma/(\omega-1)^{1/2}$ for the
logarithmic curve, the finite limit is always greater in absolute value for the
logarithmic curve than for the type III curve.


PART II

THE SEMI-LOGARITHMIC CORRELATION SURFACE

Suppose that the correlation surface of the functions, $x = f(u, v)$
and $y = g(u, v)$, is a normal correlation surface and each has its
mean as the origin and its standard deviation as the unit. Then,
the probability that values of x will lie between x and $x + dx$
and values of y between y and $y + dy$ is

$$\varphi(x, y)\,dx\,dy = \frac{1}{2\pi\sqrt{1-r^2}}\exp\left\{-\frac{x^2 - 2rxy + y^2}{2(1-r^2)}\right\}dx\,dy. \tag{1}$$

It follows that the probability that values of u will lie between
u and $u + du$ and values of v between v and $v + dv$ is $F(u, v)\,du\,dv$
given by

$$F(u, v)\,du\,dv = \frac{1}{2\pi\sqrt{1-r^2}}\exp\left\{-\frac{x^2 - 2rxy + y^2}{2(1-r^2)}\right\}\left|\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}\right|du\,dv. \tag{2}$$

$F(u, v)$ is, therefore, a generalized correlation surface of two
variables, deduced by extending the method of translation for
treating frequency distributions of one variable.

It is clear that in this general form the correlation surface
represented by $F(u, v)$ is of little practical use, on account of its
complexity. Now a natural simplification suggests itself. That
is to take x as a function of u only and y as a function of v
only. By virtue of this simplification, $F(u, v)$ becomes

$$F(u, v) = \frac{1}{2\pi\sqrt{1-r^2}}\exp\left\{-\frac{x^2 - 2rxy + y^2}{2(1-r^2)}\right\}\frac{dx}{du}\frac{dy}{dv}, \tag{3}$$

which is a great deal easier to handle than before.

Professor Wicksell has made use of (3) for the special case
where, in our notations, x and y are





U-<3f 

4 



which leads to the so-called "logarithmic correlation surface".*
The surface possesses the property that its marginal distributions
as well as the distributions of u for given values of v and distributions
of v for given values of u are all logarithmic frequency
distributions.

Presently we shall study another case for which



The correlation surface given by (3) then becomes: 




^rr A c Cv- <aj/I^ ^ 


w 


which may be appropriately called a semi-logarithmic correlation
surface. We shall investigate its marginal distributions, moments
and regression curves of the characteristics.

* In Wicksell's paper, "On the Genetic Theory of Frequency", the
theory of the logarithmic correlation function is developed. In his two
successive papers quoted in the Bibliography of this paper, the original
theory is extended and the application of the extended results illustrated.

1. MARGINAL DISTRIBUTIONS

Now, we shall first find the distribution of the marginal totals
of u. This can be, of course, accomplished very easily by integrating
$F(u, v)$ with respect to v over the range from a to
infinity. The result is:

/ l^^u. v)<^v= e (5) 


Thus, the marginal distribution of u obeys the normal law of
error.

Similarly, if we integrate $F(u, v)$ with respect to u over the
range from $-\infty$ to $\infty$, we find at once the marginal distribution
of v as follows:

$$\int_{-\infty}^{\infty} F(u, v)\,du = \frac{1}{\sqrt{2\pi}\,c\,(v-a)}\exp\left\{-\frac{1}{2c^2}\left[\log\frac{v-a}{b}\right]^2\right\}, \tag{6}$$

which is clearly a logarithmic distribution and, therefore, has all
the properties and characteristics discussed in Part I. Hence, the
semi-logarithmic correlation surface is characterized by the fact
that one marginal distribution is normal, while the other is logarithmic.
It is needless to mention that this does not constitute a
sufficient condition for a correlation surface to be a semi-logarithmic
correlation surface defined by (4).

2. MOMENTS

The moment, / 4 *j , of the semi-logarithmic correlation sur- 
face aliout the point and is given by 

■“CO d 

= ^ t e~ I c//- 


(7) 





where of/ 


■‘"Yt)' 


if k is even 


if k is odd. 


Using relation (7), we can easily calculate the following six 
moments about the mean oi u ^ ^ and the mean of : 

sJ^ 

Moj’‘ ^ + <a)=0 

rAcbe^ 

Now, we want to solve these equations for the six param- 
eters, As before, we let write ^ 

Again, we have u) as the only real root of the cubic. 

, (9) 

The six parameters of tne semi-logarithmic correlation surface 
can be written 'as : 


A •* 
c^CJo^ coj^ 

^ /-^ )ir^ 


::! K/litL±£ 

‘ y 


-y^// r co-jjl 

T 



which furnish us a simple practical method for determining the
parameters of the semi-logarithmic correlation surface for observed
data.

3. REGRESSION OF THE MEAN

First, let us observe that the function $F(u, v)$ may be put
into the following forms:












■27r,/FP cA'-d) A 








I 


'-T-f 


Hence, the distribution of u for a particular array of v is normal:


&, Tsdr — jr-~ e '' 


— Hz/ E 4 

v{a 


(H) 


and the distribution of v for a particular array of u is logarithmic:








rV 

i^ya-rvY'^ , 


( 12 ) 


To find the mean of u for a particular value of v, we multiply
(11) by u and integrate the resulting expression with
respect to u over the range from $-\infty$ to $\infty$:


u «■ ^ u {'uj yj <^u » ^ ^ ) 



which is the regression equation of the mean of u on v and
may be called the logarithmic regression equation.

Similarly, the regression equation of the mean of v on u is
found to be

^ ^ ( 14 ) 

which may be named the exponential regression equation.

Observe the following points:

(a) The regression curves (13) and (14) intersect at the
point

-j- 

\/*de • 

(b) When $r = 0$, the curves become two straight lines:


a ^ » W{2 

Vs t>e ^ TT)^ 


which show that u is independent of v and v is independent
of u. We can also see this from the expression $F(u, v)$, which
becomes










when $r = 0$. This is the condition for independence of u and
v in a probability sense.

(c) When $r = 1$, these two regression curves coincide. This
signifies that there exists a complete functional relationship between
u and v, namely:


u-r _ 1 




(d) As we have learned from the studies on the normal
correlation surface, r is the coefficient of correlation measuring
the linear relationship between x and y.
Thus, it is also a measure of relationships (13) and (14) existing
between u and v. If we note that r may be written as

'' <^U ^ 


- 






(15) 


we see that r is always greater than $r_{uv}$, which would
be the coefficient of correlation measuring the linear relationship
between u and v, if we treated the correlation surface of u and
v as being normal.

The smaller the value of c, the smaller the difference
between r and $r_{uv}$. In fact, we can show, as we
did for the one-variable case, that as c goes to zero the semi-logarithmic
correlation surface approaches the normal correlation
surface.

Incidentally, we may remark that the expression (15) is
convenient for computing r.
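The gap between the two coefficients can be illustrated with a small sketch. It assumes the standard relation between the correlation r of the underlying normal pair and the product-moment correlation $r_{uv}$ of a normal variable with a logarithmically distributed one, $r = r_{uv}\,(e^{c^2}-1)^{1/2}/c$; whether (15) is written in exactly this form is an assumption, and the function name is hypothetical.

```python
import math


def r_from_product_moment(r_uv, c):
    """Assumed relation: r = r_uv * sqrt(exp(c^2) - 1) / c, for c > 0."""
    return r_uv * math.sqrt(math.exp(c * c) - 1.0) / c


for c in (0.5, 0.2, 0.05):
    print(f"c = {c}: r = {r_from_product_moment(0.8, c):.5f} for r_uv = 0.8")
```

The multiplying factor exceeds unity for every c and tends to unity as c goes to zero, in keeping with the approach to the normal surface.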


4. REGRESSION OF THE MOMENTS

Using the well-known formulae for the moments of the normal
curve of error about the mean, we can find at once the s-th
moment of u for a given value of v about its mean:


0* 

yjc^u 


-00 










( 16 ) 


if ^ is even 
if <5 is odd 


This is the regression equation of the s-th moment of u about
the mean on v. It follows that the s-th standard moment of u
for a given value of v is:






S.-U 




^.u 


jS/ 


if J is even 


(17) 


X 




if ^ is odd. 


Again, by the formulae given in Part I for the moments of 
the logarithmic distribution, we calculate the vj ^^^moment 
about the point : 


M' = f f ^ 

^ ^ u.r. s^c-^r 


,jyS^scr 


V/-rV 


(18) 


And the regression equation of the moment of i/ about the 
mean on a is : 


(19) 


01^ 

yV*'- /'u^ yjc^y 

k^o 

The standard moment of v for a particular value of u is> 
therefore, 

M 

^ ' tr.* 


/ c-j}^VVe 

M dcmp 


( 20 ) 

J j J/? ’ 

Having obtained the expressions for the regressions of the
moments of one variable on the other, we shall now proceed to
discuss the scedasticity, clisy and synagic* of the semi-logarithmic
correlation surface.

5. SCEDASTICITY

From formula (16), we have the regression of the second
moment of u about the mean on v,

( 21 ) 


which is the same as in the case of the normal correlation surface,
except that r now does not measure the linear relationship between
u and v. Since (21) is free of v, the semi-logarithmic
correlation surface is homoscedastic, so far as the variable u is
concerned.

From the standpoint of estimation, we may also interpret
expression (21) to mean that when we estimate the mean value
of u for a particular value of v, the error of estimation will be
reduced if we use formula (13) instead of the mean of the marginal
distribution of u. The standard deviation about (13) is only
$(1-r^2)^{1/2}$ times that of the marginal distribution of u,
as shown by (21).

The second moment of v for a particular value of u is given
by (19):




( 22 ) 


which is not independent of u. So, the semi-logarithmic correlation
surface is not homoscedastic for v. Actually the standard
deviation of the distribution of v for a given u increases
with u.

However, the relative dispersion or relative error for the
distributions of v for different values of u is a constant, namely:

$$(e^{c^2(1-r^2)}-1)^{1/2}. \tag{23}$$

* The term "synagic" was used by S. D. Wicksell to mean the regression
of the kurtosis. ("The Correlation Function of Type A, and the
Regression of its Characteristics", Kungl. Svenska Vetenskapsakademiens
Handlingar, Band 58, Nr. 3; Meddelanden från Lunds Observatorium, Ser.
II, Nr. 17, 1917.)


Thus, by using formula (14) to estimate the mean value of v
for a given value of u instead of employing the mean of the
marginal distribution of v, we reduce the relative error of estimation,
for the relative error of the marginal distribution is
$(e^{c^2}-1)^{1/2}$. The reduction of relative error is especially pronounced
when r is large. In fact, the greater r is, the greater the reduction
of relative error and the better the estimation. Hence, r
measures the degree of relationships (13) and (14) between u
and v.

6. CLISY AND SYNAGIC

Now, we shall study the clisy and synagic of the semi-logarithmic
correlation surface, or the regression of the skewness and
kurtosis of one variable on the other.

The skewness and kurtosis of any distribution of u for a particular
value of v, measured by $\alpha_3$ and $\eta$, are, of course,
equal to zero, since it is a normal distribution. But the skewness
and kurtosis of any distribution of v for particular values of u,
according to formula (20), are given by:

$$\alpha_3 = (e^{c^2(1-r^2)}+2)(e^{c^2(1-r^2)}-1)^{1/2}, \tag{24}$$

$$\eta = (e^{c^2(1-r^2)}-1)(e^{3c^2(1-r^2)}+3e^{2c^2(1-r^2)}+6e^{c^2(1-r^2)}+6), \tag{25}$$

which are two constants. Since the skewness and kurtosis of the
marginal distribution of v are given by $(e^{c^2}+2)(e^{c^2}-1)^{1/2}$ and
$(e^{c^2}-1)(e^{3c^2}+3e^{2c^2}+6e^{c^2}+6)$, respectively, we may say that
the distribution of v for each array of u has smaller skewness
and kurtosis, and is, therefore, closer to the normal distribution
than the marginal distribution of v. And it is more so, when r
is near unity.

7. REGRESSION OF OTHER CHARACTERISTICS 

In this section, we shall give the regression of other characteristics,
such as the median, the geometric mean, the mode, the
points of inflection and the finite limit.

The regression equation of the median and the mode of u 
on are, of course, the same as that of the mean of on , 
because normal. The points of inflection of 

points one standard deviation, i.e., to the left and the right 

of the mean, as this is again a well-known property of the normal 
distribution, 

. The regression equation of the median and the geometric 
mean of on c/ is given by 

which differs from the regression equation of the mean or the 
median of 4 ^^ on , only in that the constant factor is on the 
left member of equation (13) but is on the right member of (26). 

The mode of v for special values of U is 

^ . { 21 ) 

The regression equations of the points of inflection of v' on 
u are given by 


which are not free of u . 



Finally, we may add that the finite limit of any distribution
of v for a particular array of u is the same as that of the marginal
distribution of v.

8. ILLUSTRATION

For illustrating the application of the semi-logarithmic correlation
surface, we take the correlation table of heights and
weights of 11,382 school boys between 5 and 14 years of age in
Glasgow from Isserlis's paper, "On the Partial Correlation
Ratio".* We shall treat the height as the variable u and the
weight as the variable v. Thus, the marginal distribution of
heights is supposed to be normal, while that of weights is supposed
to be logarithmic.

Letting the class marks, 49 inches and 56 pounds, be the
origins of u and v, respectively, and the class intervals be the
respective units, we calculate the moments of this correlation
surface:**

class intervals 
»■ /, T632 class intervals 

= .01 r? 

- .2034lk class intervals 

= 2 5781 class intervals 

•k^3 - .S915 
«J: 1221 
« 4:205875 

from which we deduce the following parameters by formulae
(10):

7 = ~ 511861 class intervals 

= /. 7631 class intervals 

1.0379 
c . 1929 
s ^-13.45 
b = 13.00 
^ - .9340 


class intervals 
class intervals 



TABLE VI

Correlation Table of Heights and Weights of 11,382 School Boys
between 5 and 14 Years of Age in Glasgow
(Original Measurements of Heights Made to Nearest Inch;
Original Measurements of Weights Made to Nearest Pound)

With these parameters, the correlation surface of heights and
weights is determined. Now, we shall examine the regression
curves of this correlation surface.

Inserting the computed parameters in formulae (13) and
(14), we obtain the regression equations of the mean height on
weight and the mean weight on height. In Tables VII and VIII,
we have the mean heights for specified weights and the mean
weights for specified heights. We see, from these tables and from
Figures III and IV, that the agreement between the theoretical and
observed results is excellent. In some extreme classes the deviations
of the observed values from the theoretical values are more pronounced.
But these classes comprise only a small fraction of the
total number of cases.
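The two regression curves can be sketched from the quoted parameters. The block assumes that the mean of u for a given v is $\bar{u} = m_u + (\sigma_u r/c)\log((v-a)/b)$ and that the mean of v for a given u is $\bar{v} = a + b\exp\{rc(u-m_u)/\sigma_u + c^2(1-r^2)/2\}$, which is what conditional normality of the underlying pair gives; the symbols $m_u$, $\sigma_u$ and all names in the code are hypothetical. The class intervals of 3 inches and 5 pounds are inferred from the class limits of the tables and are likewise an assumption.

```python
import math

# Parameters quoted in the text, in class-interval units (names are hypothetical).
MEAN_U, SD_U = -0.511861, 1.7631                   # height: mean, standard deviation
A_V, B_V, C_V, R = -13.45, 13.00, 0.1929, 0.9340   # weight: a, b, c; and r
HEIGHT_ORIGIN, HEIGHT_STEP = 49.0, 3.0             # inches (step assumed)
WEIGHT_ORIGIN, WEIGHT_STEP = 56.0, 5.0             # pounds (step assumed)


def mean_height_given_weight(weight_lb):
    """Logarithmic regression of mean height (inches) on weight (pounds)."""
    v = (weight_lb - WEIGHT_ORIGIN) / WEIGHT_STEP
    u = MEAN_U + SD_U * R / C_V * math.log((v - A_V) / B_V)
    return HEIGHT_ORIGIN + HEIGHT_STEP * u


def mean_weight_given_height(height_in):
    """Exponential regression of mean weight (pounds) on height (inches)."""
    u = (height_in - HEIGHT_ORIGIN) / HEIGHT_STEP
    exponent = R * C_V * (u - MEAN_U) / SD_U + 0.5 * C_V ** 2 * (1.0 - R ** 2)
    return WEIGHT_ORIGIN + WEIGHT_STEP * (A_V + B_V * math.exp(exponent))


print(f"{mean_height_given_weight(56.0):.1f} inches at 56 lb")
print(f"{mean_weight_given_height(49.0):.1f} lb at 49 inches")
```

Evaluated at the class marks, these functions agree with the theoretical columns of the tables to within about a tenth of a unit, the remainder being attributable to the rounding of the quoted parameters.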





Now, we go further to investigate the scedasticity of the
correlation surface of heights and weights. According to the
theory, for any particular weight the standard deviation of heights
should be a constant, equal to 1.8893 inches.
This is much less than the standard deviation of the marginal distribution
of the heights, which is 5.2893 inches. That 1.8893 inches
is quite close to the observed standard deviations is shown by
Table IX and Figure V.

The theory asserts that the dispersion of weights is not the
same for different heights. But for all arrays of heights the
relative dispersion or relative error of weights is independent of
heights.
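The constant array dispersion of heights can be checked with a few lines. The sketch assumes that the standard deviation within an array is the marginal standard deviation multiplied by $(1-r^2)^{1/2}$, as for the normal surface, and that the class interval of height is 3 inches (inferred from the class limits); the variable names are hypothetical.

```python
import math

SD_U_CLASS = 1.7631      # marginal standard deviation of height, in class intervals
R = 0.9340               # correlation parameter quoted in the text
INCHES_PER_CLASS = 3.0   # assumed class interval of height

marginal_sd = SD_U_CLASS * INCHES_PER_CLASS
array_sd = marginal_sd * math.sqrt(1.0 - R ** 2)
print(f"marginal {marginal_sd:.4f} in., within an array {array_sd:.4f} in.")
```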

TABLE VII

The Mean Heights for Specified Weights

                        Mean Height (Inches)
Weight (Pounds)      Observed      Theoretical

  24- 28               34.4           33.2
  29- 33               36.5           36.4
  34- 38               39.3           39.3
  39- 43               41.8           41.9
  44- 48               44.0           44.2
  49- 53               46.4           46.4
  54- 58               48.5           48.3
  59- 63               50.5           50.2
  64- 68               52.1           51.9
  69- 73               53.2           53.5
  74- 78               54.9           55.0
  79- 83               56.0           56.4
  84- 88               57.1           57.8
  89- 93               58.4           59.1
  94- 98               58.8           60.3
  99-103               60.7           61.5
 104-108               60.6           62.6
 109-113               61.0           63.6
 114-118               ....           64.7
 119-123               63.0           65.7




FIGURE III. Regression curve of mean height (inches) on weight (pounds), with observed values.

FIGURE IV. Regression curve of mean weight (pounds) on height (inches), with observed values.

FIGURE V. Curve of scedasticity of height on weight, with observed values.

FIGURE VI. Curve of scedasticity of weight on height.


TABLE VIII

The Mean Weights for Specified Heights

                        Mean Weight (Pounds)
Height (Inches)      Observed      Theoretical

  30-32                29.8           26.0
  33-35                32.5           30.0
  36-38                36.4           34.4
  39-41                39.7           39.3
  42-44                44.6           44.7
  45-47                50.4           50.8
  48-50                57.3           57.5
  51-53                65.1           64.8
  54-56                72.6           73.1
  57-59                81.7           82.1
  60-62                89.1           92.2
  63-65                92.2          103.3


According to formula (23), for any specified height, the relative
error of weights is 7.6%, which is much smaller than the relative
error of the marginal distribution of weights.

Both the theoretical and observed absolute errors or standard
deviations of weights for specified heights have been calculated
and are shown in Table X and Figure VI. The agreement between
the theoretical and observed dispersions is not as good as for the
regression of the mean weight on height. It should be noted here
that theoretically the standard deviations of weights for heights
over 76 inches are greater than the standard deviation of the
marginal distribution of weights, which is 12.8905 pounds.

In interpreting the standard deviations of weights for particular
heights, we must bear in mind that the distribution of
weights for any given height is not normal, but logarithmic.
Hence, a proper interpretation of the dispersion of weights for a
given height can be made only with reference to the skewness,
measured by the third standard moment of weights, which, according
to the theory, is a constant for all different heights. The
theoretical third standard moment of the distribution of weights
for any given height, as we shall see later, is approximately .2.




TABLE IX

The Standard Deviations of Heights for Specified Weights

                   Standard Deviation of Heights (Inches)
Weight (Pounds)     Observed     Theoretical
    24- 28            3.52          1.89
    29- 33            2.40          1.89
    34- 38            1.91          1.89
    39- 43            2.12          1.89
    44- 48            1.91          1.89
    49- 53            2.07          1.89
    54- 58            2.04          1.89
    59- 63            1.81          1.89
    64- 68            1.87          1.89
    69- 73            1.79          1.89
    74- 78            1.92          1.89
    79- 83            1.95          1.89
    84- 88            2.18          1.89
    89- 93            2.01          1.89
    94- 98            1.86          1.89
    99-103            1.62          1.89
   104-108            2.34          1.89
   109-113            0             1.89
   114-118            ....          1.89
   119-123            0             1.89


TABLE X

The Standard Deviations of Weights for Specified Heights

                   Standard Deviation of Weights (Pounds)
Height (Inches)     Observed     Theoretical
    30-32              4.6           2.8
    33-35              4.5           3.1
    36-38              4.0           3.5
    39-41              3.5           3.8
    42-44              3.6           4.3
    45-47              4.2           4.7
    48-50              4.8           5.2
    51-53              5.9           5.8
    54-56              6.3           6.4
    57-59              8.4           7.1
    60-62             12.5           7.9
    63-65             14.8           8.7







Thus, from Table II in Part I, we find that the probability that
any weight will be at most one standard deviation above or below
the mean weight for a given height is .6839 instead of .6826, as
in the case of the normal distribution. The difference between
.6839 and .6826 is slight but should not be overlooked. Moreover,
the difference would not be so small if the skewness were
larger.

Another thing we must observe is that since the standard
deviation of weights for a given height increases with height, the
probability that for a given height the weight will differ from the
mean weight for that height by, say, at most one pound is not
the same for all different heights, although the probability that
for a given height the weight will differ from the mean weight
for that height by at most one standard deviation is the same for
all different heights. The former probability is greater for smaller
heights.
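The two statements above can be checked numerically with a short sketch. It assumes that the "logarithmic" distribution of weights for a given height is what is now called a lognormal distribution with the constant relative error of 7.6% quoted earlier; the function names and the two illustrative mean weights (taken from the theoretical column of Table VIII) are choices made for this illustration only. Under that assumption the one-standard-deviation probability should come out slightly above the normal value and close to, though not necessarily identical with, the figure the author reads from his Table II.

```python
import math

REL_ERROR = 0.076  # relative error of weights for a given height (from the text)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_interval_probability(mean, rel_error, lower, upper):
    """P(lower <= W <= upper) for a lognormal W with given mean and relative error.

    Assumption: log W is normal with variance s2 = log(1 + rel_error^2) and
    location m chosen so that E[W] = mean.
    """
    s2 = math.log(1.0 + rel_error ** 2)
    s = math.sqrt(s2)
    m = math.log(mean) - 0.5 * s2
    upper_p = normal_cdf((math.log(upper) - m) / s)
    lower_p = normal_cdf((math.log(lower) - m) / s) if lower > 0 else 0.0
    return upper_p - lower_p


def within_one_sd(mean):
    """Probability that the weight is within one standard deviation of its mean."""
    sd = mean * REL_ERROR
    return lognormal_interval_probability(mean, REL_ERROR, mean - sd, mean + sd)


def within_one_pound(mean):
    """Probability that the weight is within one pound of its mean."""
    return lognormal_interval_probability(mean, REL_ERROR, mean - 1.0, mean + 1.0)


if __name__ == "__main__":
    for mean_weight in (26.0, 103.3):  # illustrative means for short and tall boys
        print(mean_weight, within_one_sd(mean_weight), within_one_pound(mean_weight))
```

The one-standard-deviation probability does not depend on the mean at all, because only the ratio of the limits to the mean enters; the one-pound probability falls as the mean weight (and hence the height) rises.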

The agreement between the theoretical and observed clisy and
synagic is, of course, not expected to be close. Theoretically, the
distributions of weights for specified heights should all have the
same skewness and kurtosis. The observed values of the skewness
and kurtosis of weights are shown below:

Height        Observed Skewness     Observed Kurtosis
(Inches)         of Weights            of Weights
 36-38              .22                  8.72
 42-44              .10                   .18
 48-50              .29                   .79
 54-56              .12                  1.54
 60-62             -.93               (illegible)



The rather large deviations of the observed kurtosis in the first
class from the theoretical value, and of the observed skewness in
the last class from the theoretical value, may be accounted for by
the fact that only 350 and 69 observations are included in the first
and the last classes, respectively.

The observed marginal distribution of heights is very sym- 
metric but is markedly leptokurtic> since its is about 2.S093, 
Hence, the fit given by a normal curve is not quite satisfactory,
as we can see from Table XI.

The observed marginal distribution of weights is quite skew
and platykurtic. As shown by Table XII, the agreement between
the observed distribution and the theoretical logarithmic distribution
is not very close.


TABLE XI

Relative Frequency Distribution of Heights of 11,382 School Boys
between 4 and 15 Years of Age in Glasgow

Class Limits     Observed Relative     Theoretical Relative Frequency
(Inches)            Frequency                (Normal Curve)
  27-29               ....                       .0003
  30-32               .0007                      .0020
  33-35               .0063                      .0095
  36-38               .0308                      .0332
  39-41               .1048                      .0846
  42-44               .1682                      .1577
  45-47               .1913                      .2154
  48-50               .1929                      .2143
  51-53               .1681                      .1561
  54-56               .0980                      .0831
  57-59               .0317                      .0324
  60-62               .0061                      .0092
  63-65               .0011                      .0019
  66-69               ....                       .0003
  Total              1.0000                     1.0000












TABLE XII

Relative Frequency Distribution of Weights of 11,382 School Boys between
4 and 13 Years of Age in Glasgow

Class Limits     Observed Relative     Theoretical Relative Frequency
(Pounds)            Frequency              (Logarithmic Curve)
  19- 23              ....                    (illegible)
  24- 28              .0014                   (illegible)
  29- 33              .0119                      .0211
  34- 38              .0640                      .0564
  39- 43              .1297                      .1041
  44- 48              .1454                      .1441
  49- 53              .1499                      .1609
  54- 58              .1363                      .1504
  59- 63              .1139                      .1234
  64- 68              .0882                      .0897
  69- 73              .0682                      .0602
  74- 78              .0461                      .0373
  79- 83              .0213                      .0212
  84- 88              .0121                      .0127
  89- 93              .0058                      .0065
  94- 98              .0033                      .0033
  99-103              .0017                      .0018
 104-108              .0006                      .0008
 109-113              .0001                      .0004
 114-118              ....                       .0002
 119-123              .0001                      .0001
  Total              1.0000                     1.0000


In closing, we may say that the semi-logarithmic correlation
surface is not at all uncommon in practice, and the method developed
here for treating it should prove rather useful. In fact,
our investigation opens up a new way for determining exponential
and logarithmic regression curves.









A SIMPLE METHOD FOR CALCULATING 
MEAN SQUARE CONTINGENCY 

By

Elmer B. Royer
The Ohio State University

If we wish to test for a possible relationship between two
variables which are not quantitatively measurable, but each of
which has two or more categories, the usual procedure is to make
a two-way table, showing the frequencies of all the possible
combinations.

Assuming independence between the two variables, a second
table is built, making the frequencies of each column proportional
to the frequencies in the column of row totals. When this is done,
each of the row frequencies is found to be proportional to the row
of column totals.

The deviation of the actual frequency for a compartment
found in Table 1 from the expected frequency as found in Table
2 is squared and this square is divided by the expected frequency.
These quotients are summed over the entire table, giving us
Chi-square.

The calculation of Chi-sqnare can be made much simpler by 
simplifying the formula. 

The probability of the occurrence of two independent events
is the product of their separate probabilities. Thus the probability
of the joint occurrence of Category 3 of the first classification and
Category d of the second classification is the probability of the
occurrence of Category 3 (which is taken to be the fraction of the
total number of cases which fall in Category 3), times the probability
of the occurrence of Category d. The expected frequency
of the compartment is this product of separate probabilities, multiplied
by the total number of cases. If we let $f_o$ be the actual
compartment frequency, $f_e$ the expected compartment frequency,
$f_r$ the total frequency for the row, $f_c$ the total frequency for the
column, and $N$ the number of cases, we may write,

$$ f_e = \frac{f_r f_c}{N}. \tag{1} $$

Also,

$$ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \sum \frac{f_o^2}{f_e} - 2\sum f_o + \sum f_e. \tag{2} $$

Since the table must sum to $N$, whether we have filled it with
actual frequencies, or with expected frequencies, (2) reduces to,

$$ \chi^2 = \sum \frac{f_o^2}{f_e} - N. $$

Substituting for $f_e$ from (1),

$$ \chi^2 = N \sum \frac{f_o^2}{f_r f_c} - N = N\left(\sum \frac{f_o^2}{f_r f_c} - 1\right). \tag{3} $$




In order to illustrate our method, we shall compute Chi-square
for Table 18 on page 86 of Fisher's Statistical Methods for Research
Workers. Our computations are presented in the following
table.



                                        Column of the table
                                  (1)        (2)        (3)        (4)      Total

Coupling     F1 Males   Frequency   88         82         75         60       305
                        Square    7744       6724       5625       3600
                        Product  25.3902    22.0459    18.4426    11.8033

             F1 Females Frequency   38         34         30         21       123
                        Square    1444       1156        900        441
                        Product  11.7398     9.3984     7.3171     3.5854

Repulsion    F1 Males   Frequency  115         93         80        130       418
                        Square   13225       8649       6400      16900
                        Product  31.6388    20.6914    15.3110    40.4306

             F1 Females Frequency   96         88         95         79       358
                        Square    9216       7744       9025       6241
                        Product  25.7430    21.6313    25.2095    17.4330

Frequency Total                    337        297        280        290      1204
Product Total                    94.5118    73.7670    66.2802    73.2523
Quotient                         .280450    .248374    .236715    .252594   1.018133


The actual frequency is the first entry in each compartment. 
The square is read from a table and written directly beneath the 
frequency. The reciprocal of 305 is put into the keyboard of a 
calculator and multiplied in turn by each of the squares in the 
first row. The products make the third entries in the compart- 
ments. 










































These products are summed by columns and the sums divided
by the frequency totals of the corresponding columns. These
quotients are summed horizontally. This sum is $\sum f_o^2/(f_r f_c)$ and can
be substituted in (3). For our example,

$$ \chi^2 = 1204\,(1.018133 - 1.000000) = 21.832. $$

This answer agrees exactly with the answer obtained by
Fisher in his Table 19, page 87. The advantages of this method
are two-fold: (1) There is considerable saving of labor; (2)
with the simplification of calculations, we have greatly reduced
the danger of errors caused by dropping of decimal places.
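The shortcut in (3) can be sketched in a few lines and compared with the usual term-by-term computation. The frequencies below are those of the worked example as far as they can be read from the computation table; several cells are illegible in the scan and are filled here with the values that reproduce the legible row totals, column totals and product totals, so they should be treated as an assumption. The function names are illustrative only.

```python
def chi_square_direct(table):
    """Sum of (observed - expected)^2 / expected over every compartment."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total


def chi_square_shortcut(table):
    """Formula (3): N * (sum of f^2 / (row total * column total) - 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    s = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            s += observed * observed / (row_totals[i] * col_totals[j])
    return n * (s - 1.0)


# Rows: Coupling F1 males, Coupling F1 females, Repulsion F1 males,
# Repulsion F1 females (partly reconstructed from the totals, see above).
FREQUENCIES = [
    [88, 82, 75, 60],
    [38, 34, 30, 21],
    [115, 93, 80, 130],
    [96, 88, 95, 79],
]

if __name__ == "__main__":
    print(chi_square_shortcut(FREQUENCIES))
    print(chi_square_direct(FREQUENCIES))
```

Both routines are algebraically identical; the shortcut needs one division per compartment and no table of expected frequencies, which is the saving of labor claimed in the text.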



THE AMERICAN STATISTICAL ASSOCIATION 


Vol. IV 


No. 2 


THE ANNALS 

of 

MATHEMATICAL 

STATISTICS 


SIX DOLLARS PER ANNUM 


MAY' 





PUBLISHED QUARTERLY BY 
AMERICAN STATISTICAL ASSOCIATION 


Office: Edwards Brothers, Inc., Ann Arbor, Michigan
Business Office: 530 Commerce Bldg., New York Univ., New York, N. Y.

Entered as second class matter at the Postoffice at Ann Arbor, Mich.,
under the Act of March 3rd, 1879.


STATEMENT OF OWNERSHIP 
UNDER ACT OF CONGRESS OF AUGUST 24, 1912 

Publisher: American Statistical Association, New York City, New York.

Editor: H. C. Carver, University of Michigan.

Managing Editor: S. Sekhon, Ann Arbor, Michigan.

Manager: J. W. Edwards, Ann Arbor, Michigan.

American Statistical Association, 530 Commerce Bldg., New York City.



A METHOD OF DETERMINING THE CONSTANTS 
IN THE BIMODAL FOURTH DEGREE 
EXPONENTIAL FUNCTION 

By 

A. L. O'Toole

In a paper in this Journal¹ the present writer has discussed
some of the mathematical properties of a class of definite integrals
which arise in the study of the frequency function

$$ y = e^{-a^2 (x^4 + p_1 x^3 + p_2 x^2 + p_3 x + p_4)}. \tag{1} $$

This function defines the system of frequency curves for which
the method of moments is the best method of fitting (i.e. best in
the sense of maximum likelihood²) and this fact gives importance
to its study. The curves are typically bimodal, the nature and
location of the modes being given by the roots of the equation

$$ 4x^3 + 3p_1 x^2 + 2p_2 x + p_3 = 0. \tag{2} $$

The first problem which arose was that of finding an expression
for the value of the definite integral

$$ I = \int_{-\infty}^{\infty} e^{-a^2 (x^4 + p_1 x^3 + p_2 x^2 + p_3 x + p_4)}\,dx. \tag{3} $$
If a; 


p Ofi? 

p 

is replaced by this integral becomes 


(4) 


4 



00 


dx^ 


¹On the system of curves for which the method of moments is the
best method of fitting. Vol. IV, No. 1, Feb. 1933, p. 1.

²R. A. Fisher, On the mathematical foundations of theoretical statistics,
Philosophical Transactions of the Royal Society of London, vol.
222, series A (1921), p. 355.




or 


(5) 4- 

j where 

<00 

or, replacing by x where, a is the positive square root of 

(6) 

4 =vl 

/•oo \ 

or 


(7) 

H J dx. 

where 



No real loss of generality is incurred in studying (5), (6)
or (7) rather than (3). For the purposes of the previous paper
it was found convenient to discuss certain special cases of (7)
first, then (7) itself and later (5). Having in mind the practical
purposes of this note, however, attention will be focused first on
the form (5) and afterwards on (3). The transformations from
the expressions obtained in the previous paper are very simple.
For (5) the special cases studied and a few of the more important
results obtained may be stated here as follows:

Type I : 


•CO ^ 

7,^. y rf ^X n. 0.1,2A. 

00 

^ A * 





/ r * 


■* i arri} 


u.-il-.O, 

•’ 4 





rrV 


I___ 

4 ct^ ' 


hence 




*:j — ' 




■.ha.: 

4 


ct»raj 


, rt^ 




^r~ 


o, 


n^O, 1,2,4, 


Obviously, of course, k depends upon the total frequency and 
hence if the total frequency is 





<3^JH 


rri) ' 


This curve has a single mode located at $x = 0$ and is symmetrical
with respect to the ordinate at $x = 0$.



Type II: 

p=-2b , b>0, 


•’L 




a*/>^ S-Ja*A* Q-5 ia^b^Id 9-51 i 

^ 4! 6! d! 


/Yiy/- „ . .; ] 


k 

Va 






7- 3 -4^ 2./ 


) 


*aArr^)a* 


S-4 ^9-5-4‘‘ £1* 


) 


It was shown that this integral could be expressed in terms of the
Bessel functions $J_{1/4}$ and $J_{-1/4}$ as follows:



0 


sdA^jig 






where 

eirri)rri) 

/\m ^ , 

IF" 

^ Jrrjjrrij 





If the total frequency is /V 

A/ 




rwf 


This curve is symmetrical with respect to the ordinate at $x = 0$
and has two real modes located at $x = \pm\sqrt{b}$.
Type III:


4 - 


-ii r rf^^J 

('<?r7)/ i^r~y 


k 


'rrij\u . . . 1 

' ^ 4^ 4-4! 4^-6' 4^-J^/ J 

4.6/ 4* JO/ j 


This curve is not symmetrical and has only one real mode, that 
mode being located at x equal to the real cube root of negative a . 



Type IV: The general case.


4 “ 

«4 


It was shown that the value of this integral could be expressed
as an infinite series each term of which involved two Bessel functions.
But, as pointed out near the close of the previous paper,
although this infinite series may be considered a theoretical solution
of the problem, it does not lead to a simple method of
determining the constants which appear in the frequency
function. It is the purpose of this note to give a practical method
of determining these constants.

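Beginning with (5),

$$ I = \int_{-\infty}^{\infty} e^{-a^2 (x^4 + p x^2 + q x + r)}\,dx, $$

the $n$th moment is defined by

$$ \mu'_n = \frac{1}{I}\int_{-\infty}^{\infty} x^n e^{-a^2 (x^4 + p x^2 + q x + r)}\,dx. \tag{8} $$

Integrate by parts, letting $u = e^{-a^2 (x^4 + p x^2 + r)}$ and $dv = e^{-a^2 q x}\,dx$.
Then

$$ I = -\frac{1}{q}\int_{-\infty}^{\infty} (4x^3 + 2px)\, e^{-a^2 (x^4 + p x^2 + q x + r)}\,dx. \tag{9} $$

Divide by $I$ and multiply by $q$ and the result is

$$ q = -(4\mu'_3 + 2p\mu'_1). \tag{10} $$

Start again with $I$ in the form (5) and integrate by parts letting
$u = e^{-a^2 (x^4 + p x^2 + q x + r)}$ and $dv = dx$. Then

$$ I = a^2 \int_{-\infty}^{\infty} (4x^4 + 2px^2 + qx)\, e^{-a^2 (x^4 + p x^2 + q x + r)}\,dx. \tag{11} $$

Divide by $I$ and then

$$ 1 = 4a^2\mu'_4 + 2pa^2\mu'_2 + qa^2\mu'_1 $$

or

$$ a^2 = \frac{1}{4\mu'_4 + 2p\mu'_2 + q\mu'_1}. \tag{12} $$

Now integrate (11) by parts with $u = e^{-a^2 (x^4 + p x^2 + q x + r)}$ and
$dv = (4x^4 + 2px^2 + qx)\,dx$. This leads to

$$ I = \frac{a^4}{30}\int_{-\infty}^{\infty} (96x^8 + 128px^6 + 84qx^5 + 40p^2x^4 + 50pqx^3 + 15q^2x^2)\, e^{-a^2 (x^4 + p x^2 + q x + r)}\,dx. \tag{13} $$

Divide by $I$ and obtain

$$ 1 = \frac{a^4}{30}\,(96\mu'_8 + 128p\mu'_6 + 84q\mu'_5 + 40p^2\mu'_4 + 50pq\mu'_3 + 15q^2\mu'_2) $$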



or

$$ a^4 = \frac{30}{96\mu'_8 + 128p\mu'_6 + 84q\mu'_5 + 40p^2\mu'_4 + 50pq\mu'_3 + 15q^2\mu'_2}. \tag{14} $$

Squaring in (12) the result is

$$ a^4 = \frac{1}{(4\mu'_4 + 2p\mu'_2 + q\mu'_1)^2}. \tag{15} $$

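Eliminating $a^4$ between (14) and (15) gives the equation

$$ p^2(40\mu'_4 - 120\mu'^2_2) + q^2(15\mu'_2 - 30\mu'^2_1) + pq(50\mu'_3 - 120\mu'_1\mu'_2) + p(128\mu'_6 - 480\mu'_2\mu'_4) + q(84\mu'_5 - 240\mu'_1\mu'_4) + 96\mu'_8 - 480\mu'^2_4 = 0. \tag{16} $$

Using relation (10) and

$$ q^2 = 16\mu'^2_3 + 16p\mu'_1\mu'_3 + 4p^2\mu'^2_1 \tag{17} $$

eliminate $q$ from (16) obtaining

$$ 5Ap^2 + 2Bp + 2C = 0 \tag{18} $$

and hence

$$ p = \frac{-B \pm \sqrt{B^2 - 10AC}}{5A} \tag{19} $$

where

$$ \begin{aligned}
A &= 2\mu'_4 - 6\mu'^2_2 + 15\mu'^2_1\mu'_2 - 6\mu'^4_1 - 5\mu'_1\mu'_3, \\
B &= 90\mu'_1\mu'_2\mu'_3 - 60\mu'^3_1\mu'_3 - 25\mu'^2_3 + 16\mu'_6 - 60\mu'_2\mu'_4 - 21\mu'_1\mu'_5 + 60\mu'^2_1\mu'_4, \\
C &= 30\mu'_2\mu'^2_3 - 60\mu'^2_1\mu'^2_3 - 42\mu'_3\mu'_5 + 120\mu'_1\mu'_3\mu'_4 + 12\mu'_8 - 60\mu'^2_4.
\end{aligned} \tag{20} $$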


In order to decide upon one of the two values of $p$ furnished
by (18) notice that, equating the first derivative of the frequency
function to zero, the location of the two modes and the minimum
point between them is determined by the roots of the equation

$$ 4x^3 + 2px + q = 0. \tag{21} $$

The condition for three real distinct roots in this equation is

$$ 27q^2 + 8p^3 < 0, \tag{22} $$

which requires $p < 0$. If $27q^2 + 8p^3 = 0$, where $q$ is found from (17),
then one of the modes coincides with the minimum point. If $p = q = 0$
then both modes coincide with the minimum point.

Extracting the square root in (17) gives two values of $q$
differing only in sign. Now it is easy to show either by geometrical
considerations or by examining the algebraic manipulations
leading to (18) that $p$ is independent of the sign of $q$. Changing
$q$ to $-q$ in (5) has the same effect as changing $x$ to $-x$ or,
that is, reversing the order of the distribution and curve. Also,
changing $x$ to $-x$ leaves the even moments unaltered but changes
the sign of every odd moment. Hence if the value of the function
at the modal position on the left is greater than the value of the
function at the modal position on the right then $q$ is greater than
zero. And if the value of the function at the modal position on
the left is less than the value of the function at the modal position
on the right then $q$ is less than zero. If $q = 0$ the curve is symmetrical
with respect to the ordinate at $x = 0$. Hence $p$ and $q$
are determined by (19), (17) and (22), the sign of $q$ being
fixed by examination of the data of the problem or, if necessary,
by trial. The value of $a^2$ is then found by taking the positive
square root in (15). Of course (14) would give the same value
for $a^2$.
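The chain (20), (19), (10), (12), (22) can be tried on a case where the answer is known. The sketch below is an illustration under stated assumptions, not the author's computation: it generates the moments of a curve of the form (5) by simple rectangle quadrature for chosen constants (the constants are of the size met in the worked example later in this note, and the coefficient formulas are the ones reconstructed above), then recovers the constants from those moments. The function names are hypothetical.

```python
import math


def raw_moments(a2, p, q, lo=-30.0, hi=30.0, step=0.005):
    """Moments mu'_0..mu'_8 of exp(-a2*(x^4 + p*x^2 + q*x)) by rectangle quadrature."""
    n = int(round((hi - lo) / step))
    sums = [0.0] * 9
    for i in range(n + 1):
        x = lo + i * step
        w = math.exp(-a2 * (x ** 4 + p * x * x + q * x))
        power = 1.0
        for k in range(9):
            sums[k] += power * w
            power *= x
    return [s / sums[0] for s in sums]  # index n holds mu'_n, with mu'_0 = 1


def solve_constants(mu):
    """Return both candidate solutions (p, q, a2) given by (19), (10) and (12)."""
    m1, m2, m3, m4, m5, m6, m8 = mu[1], mu[2], mu[3], mu[4], mu[5], mu[6], mu[8]
    A = 2 * m4 - 6 * m2 ** 2 + 15 * m1 ** 2 * m2 - 6 * m1 ** 4 - 5 * m1 * m3
    B = (90 * m1 * m2 * m3 - 60 * m1 ** 3 * m3 - 25 * m3 ** 2 + 16 * m6
         - 60 * m2 * m4 - 21 * m1 * m5 + 60 * m1 ** 2 * m4)
    C = (30 * m2 * m3 ** 2 - 60 * m1 ** 2 * m3 ** 2 - 42 * m3 * m5
         + 120 * m1 * m3 * m4 + 12 * m8 - 60 * m4 ** 2)
    root = math.sqrt(B * B - 10 * A * C)
    candidates = []
    for sign in (1.0, -1.0):
        p = (-B + sign * root) / (5 * A)          # formula (19)
        q = -(4 * m3 + 2 * p * m1)                # formula (10)
        denom = 4 * m4 + 2 * p * m2 + q * m1
        a2 = 1.0 / denom if denom != 0 else float("inf")  # formula (12)
        candidates.append({
            "p": p,
            "q": q,
            "a2": a2,
            "three_real_roots": 27 * q * q + 8 * p ** 3 < 0,  # condition (22)
        })
    return candidates


if __name__ == "__main__":
    mu = raw_moments(0.000264, -202.7862, 29.4284)
    for candidate in solve_constants(mu):
        print(candidate)
```

Because (10), (12) and (14) are exact identities for a curve of this form, one of the two candidates reproduces the constants that generated the moments; the other is the extraneous root of the quadratic, to be rejected by (22) or by inspection of the data as the text describes.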

Now that $a^2$, $p$ and $q$ are determined, there remains only
$k$ to be found. If the total frequency is $N$ then

$$ N = k\int_{-\infty}^{\infty} e^{-a^2 (x^4 + p x^2 + q x)}\,dx $$

and hence

$$ k = \frac{N}{\displaystyle\int_{-\infty}^{\infty} e^{-a^2 (x^4 + p x^2 + q x)}\,dx} \tag{23} $$

where the numerical value of the integral in the denominator can
be found by mechanical quadrature to any desired degree of approximation.
For purposes of the quadrature involved here it will
be found that the simple rectangle quadrature formula will give
as good results as could be desired.³ Having found $k$ then the
constant $r$ is also known since

$$ k = e^{-a^2 r}. \tag{24} $$

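The points of inflexion are located by equating the second
derivative of the function to zero. The equation is

$$ a^2 (4x^3 + 2px + q)^2 - 12x^2 - 2p = 0. \tag{25} $$

If now $x$ be replaced by $x + m$ then

$$ I = \int_{-\infty}^{\infty} e^{-a^2 (x^4 + p x^2 + q x + r)}\,dx $$

becomes

$$ I = \int_{-\infty}^{\infty} e^{-a^2 (x^4 + p_1 x^3 + p_2 x^2 + p_3 x + p_4)}\,dx \tag{3} $$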



³On the Degree of Approximation of Certain Quadrature Formulas,
Annals of Mathematical Statistics, vol. IV, No. 2, May 1933, p. 143, by
A. L. O'Toole.


where

$$ \begin{aligned}
p_1 &= 4m, \\
p_2 &= 6m^2 + p, \\
p_3 &= 4m^3 + 2mp + q, \\
p_4 &= m^4 + m^2 p + mq + r.
\end{aligned} \tag{26} $$

The data⁴ in the first two columns of the table given here
will provide the basis for an illustration of the method described
above for determining the constants. The numbers in the first
column are the classes into which the plants were divided. In the
second column are found the frequencies corresponding to the
various classes. In constructing the third column the origin for
$x$ was arbitrarily placed to correspond to the class 25. Taking




the first SIX moments and the eighth moment are found to be 

OJ94690S, 

* ‘^7. 09735, 

975 , 

410.3540, 

= ;so 9;s2s. 5, 


⁴This data, except for slight modifications, was extracted from that
of W. L. Tower on the Seriation of Counts of Rays of Chrysanthemum
Leucanthemum, Biometrika No. 1, 1901-2, p. 313.


Ci-flss 

ESI 


y' 


WSIm 

9 


mm 

.031102 

.077755 

0 

mm 


19 

.300472 

.751180 

1 

■11 


-14 

1,583594 

3.958985 

4 

12 

1 

-13 

4.991068 

12.477670 

12 



-12 

10.246676 

25.616690 

26 

’14 


-11 

14.831798 

37.079495 

37 

IS 


-10 

16.280112 

40.700280 

41 

16 



14.482S40 

36.206350 

36 

17 



11.089036 

27.722590 

28 

18 

8 


7.712195 

19280487 

19 

19 



S.108903 

12.772257 

13 

20 

27 


3.3S9072 

8.397680 

s 

21 

43 


2269766 

S.674415 

6 

22 

34 

- 3 

1.621764 

4-054410 

4 

23 

32 

- 2 

1,252760 

3.131900 

3 

24 

31 

- 1 

1.062910 

2.657275 

3 

25 

30 

0 

1.000000 

2.500000 

3 

26 

24 

1 

1.046530 

2.616325 

3 

27 

20 

2 

1.214445 

3.036112 

3 

28 

20 

3 

1.547935 

3.869837 

4 

29 

16 

4 

2.133051 

5.332627 

5 

30 

12 

5 

3.108096 

7.770240 

8 

31 

20 

6 

4.654336 

11.635840 

12 

. 32 

20 

7 

6.917724 

17.294310 

17 

33 

30 

8 

9.793409 

24.483522 

24 

34 

21 

9 

12.S93310 

31.483275 

31 

35 

6 

10 

13.938151 

34.8453>7 

35 

36 

6 

11 

12.502565 

31.256412 

31 

37 

2 

12 

8.504388 

21.260970 

21 

38 


13 


10.196442 

10 

39 


14 

mmt 

3.185327 

3 

40 


IS 

.238029 

.595072 

1 

41 


16 

.024259 

.060647 

0 


452 


180.7M704 















[Figure: the original distribution and the fitted curve, plotted against the class numbers.]







Formulas (20) give

$$ A = -1409.786, \qquad B = -790429, \qquad C = -15354106. $$

Hence from (19)

$$ p = -202.7862 \quad \text{or} \quad p = -21.48292. $$

But $p = -21.48292$ and the value of $q$ to which it leads do
not satisfy the relation (22); hence use $p = -202.7862$. Calculate
$q$ from (17) and use the positive square root since an
examination of the data shows that the value of the function at
the left modal value is greater than the value of the function at
the right modal value. Hence

$$ q = 29.4284. $$

Formula (15) now gives as the positive square root

$$ a^2 = 0.0002640. $$

Using these values for $a^2$, $p$, $q$ the values of the function

$$ y' = e^{-a^2 (x^4 + p x^2 + q x)} $$

are calculated for integral values of $x$ from $-16$ to $+16$
and tabulated in column four. The constant $k$ is then found by
dividing the total frequency 452 by the sum of column four.
Hence

$$ k = \frac{452}{180.792704} = 2.500100. $$

By (24)

$$ r = -3472.578. $$

The function can now be written

$$ y = 2.500100\, e^{-0.0002640\,(x^4 - 202.7862 x^2 + 29.4284 x)} $$

or

$$ y = e^{-0.0002640\,(x^4 - 202.7862 x^2 + 29.4284 x - 3472.578)}. $$
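The rectangle quadrature used for $k$ is short enough to sketch directly. The constants below are the values of $a^2$, $p$ and $q$ of the worked example as they can best be read from the scan (the digits of $p$ in particular are an assumption, chosen because they reproduce the tabulated ordinates of column four); with $a^2$ rounded to four significant figures the result agrees with the printed $k$ only to about two decimal places, and $r$ shifts accordingly. Variable names are illustrative.

```python
import math

# Constants of the worked example as read from the scan (an assumption).
A2 = 0.0002640
P = -202.7862
Q = 29.4284
TOTAL_FREQUENCY = 452


def column_four(x):
    """Unscaled ordinate exp(-a2*(x^4 + p*x^2 + q*x)) at the class mid-point x."""
    return math.exp(-A2 * (x ** 4 + P * x * x + Q * x))


# Ordinates at the integral values x = -16, ..., 16 (class width 1).
ordinates = {x: column_four(x) for x in range(-16, 17)}

# Rectangle rule: with unit class width the integral is just the sum.
rectangle_sum = sum(ordinates.values())

k = TOTAL_FREQUENCY / rectangle_sum   # formula (23)
r = -math.log(k) / A2                 # from formula (24), k = exp(-a2 * r)

# Fitted frequency for each class (class number = x + 25).
fitted = {x + 25: k * y for x, y in ordinates.items()}

if __name__ == "__main__":
    print("sum of column four:", rectangle_sum)
    print("k:", k)
    print("r:", r)
```

By construction the fitted frequencies add up to the total frequency, which is the property that fixes $k$.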



The values of the ordinates for this function are given in
column five and to the nearest integer in column six.

Equation (21) becomes

$$ 4x^3 - 2(202.7862)x + 29.4284 = 0 $$

which has the roots (approximate) $x = -10.1$, $0.07$, $10.03$.
It should be noted that the sum of the three roots must equal
zero. Hence the modes are located at $x = -10.1$ and at $x = 10.03$
with the minimum point at $x = 0.07$. These roots can be determined
to any desired number of decimal places by Horner's
method.
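As a check on the roots just quoted, the cubic (21) can be solved numerically. The sketch below uses plain bisection rather than Horner's method, bracketing each root between the turning points of the cubic; the function names are hypothetical, and the value of $p$ is the reading of the garbled scan adopted here, so it should be treated as an assumption.

```python
import math


def cubic(x, p, q):
    """Left-hand side of equation (21): 4x^3 + 2px + q."""
    return 4.0 * x ** 3 + 2.0 * p * x + q


def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stationary_points(p, q):
    """Left mode, minimum point, right mode of the curve, in that order."""
    if 27.0 * q * q + 8.0 * p ** 3 >= 0:
        raise ValueError("condition (22) fails: no three distinct real roots")
    turning = math.sqrt(-p / 6.0)                       # zeros of 12x^2 + 2p
    bound = 1.0 + max(abs(p) / 2.0, abs(q) / 4.0)       # Cauchy bound on the roots

    def f(x):
        return cubic(x, p, q)

    return [
        bisect(f, -bound, -turning),
        bisect(f, -turning, turning),
        bisect(f, turning, bound),
    ]


if __name__ == "__main__":
    roots = stationary_points(-202.7862, 29.4284)
    print(roots, sum(roots))
```

Since the cubic has no term in $x^2$, the three roots must add to zero, which gives a quick check on any hand computation.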

If now $x$ is replaced by $x - 25$ so that the new values of $x$
are respectively equal to the numbers in the class column, the
function becomes

$$ y = e^{-a^2 [(x - 25)^4 + p (x - 25)^2 + q (x - 25) + r]}, $$

with the values of $a^2$, $p$, $q$ and $r$ found above. The modes are
now located at $x = 14.9$ and $35.03$ with the
minimum point at $x = 25.07$. In the figure are shown the original
distribution and the curve represented by this equation.



ON THE TCHEBYCHEF INEQUALITY OF 
BERNSTEIN 

By 

Cecil C. Craig¹

From Tchebychef's inequality we know that if $x_1, x_2, \ldots, x_n$
are a set of independent statistical variables with



* 




O' f , 


then the probability P that 


iff 

satisfies the inequality, 

P> y-p • 

This gives a lower limit for P which is often unsatisfactory.
Improvement of this result requires further hypotheses. As is
well known, Pearson, Camp, Guldberg, Meidel, Narumi,² and
Smith³ have attacked this problem with considerable success. Another
interesting and important attempt in this direction due to
S. Bernstein seems to have generally escaped attention in the
English-speaking world, at least, since it has been published only in
Russian.⁴ Because of the latter fact, it seems necessary to give


¹This paper was written in substantially its present form during the
author's tenure of a National Research Fellowship at Stanford University.

²For references to all these papers except Smith's and a brief discussion
see Rietz, H. L., Mathematical Statistics (Open Court Publishing
Company, Chicago, 1927), pp. 140-144.

³Smith, C. D., On Generalized Tchebychef Inequalities in Mathematical
Statistics, American Journal of Mathematics, Vol. 52 (1930), pp. 109-126.

⁴Bernstein, S., Theory of Probability (Moscow, 1927), pp. 159-165.
The present account of this work of Bernstein is taken from a lecture of
Professor J. V. Uspensky.





a brief account of this work of Bernstein’s preliminary to the 
remarks based on it the writer wishes to make. 

Bernstein imposed the condition in addition that 

( 1 ) .., 77 , 

rwd “the mathematical expectation of in which h 
is arbitrary. (This condition is satisfied, e.g., if the i are 
bounded.) and used the following lemma due to Tchebychef. Let 
the statistical variable u be always >0. If then the 

probability Q that u satisfies the inequality, 

Then taking. 






Cx., £)£. 
xz e ' e • 

■e « 

in which C is arbitrary, 

eruj . sre • • ■ £('e 

and under the condition (1), 




' 2 


If it is assumed that 


then 

\£\^i c< I 

cW, 

i 

and thus 

(2) 

eru)< 

* 


If in the inequality, u , a greater quantity is substituted 




for A , then certainly . Therefore the probability Q of 




u>e 


satisfies the inequality 


e 


Now 


un e ' "* > e 

implies for £>0, 

£<3^ 

The value of C is next chosen so as to make Q a minimum, i.e., 


so as to make? .* 




I AO-c) 


a minimum. Thus 




^hen the probability Q that 


/ z 

satisfies the inequality, 


j? i 

n t /-c -/ 


It t « — c<i. 

To get the corresponding result for the lower limit of the 
sum . . . . ^ , it is only necessary to choose 

€<0 zt\d as before, the probability, Q ' , that 





satisfies the inequality, 




if also and l£|:5 f- with c<i. 

Combining these two results, if P is the probability of 

\ 

then since 

p-h Q->-Q' = I, 


Pi l-£e 


-TT 


£ 


But setting 





^ and abo> 


the condition 






2<l-c) 


TT^S 




• 2 

(Bernstein set <f “ jp- . merely the equality sign in this 

condition. The value of c as here given is necessary in the 
author’s developments below.) must be satisfied, or what is the 
same thing, 

2Ci- 

2ct* ^ ’ 


Ci 


*’/ 7 CO 


from which 




This last quantity on the right is positive and < i as required so 
that the constants can actually be chosen as specified. 

This gives ^ 

and finally the probability, P ^ that 

^ a; 

is such that 

P>l-ee , 

ar setting oj ^ ter 


(3) P> . 

It is to be observed that generally the quantity rapidly de- 

creases with increasing ?? . 

This is the inequality reached by Bernstein under the condi- 
tion (1). 

^ If all the/^-c? are bounded, if, say, always 

k/U A 


one may take h 


3 ‘ 


It is the purpose of the author's remarks to discuss less severe
conditions than (1) under which the inequality (3) can be obtained.
These more general conditions are obtained, however, at the expense
of assuming quite generally satisfied regularity conditions with
regard to the "tails" of the frequency distribution of $x$, which
need not necessarily be regarded as the sum of $n$ component
variables.

If we now take 


H) 


<f X 


e 





we have 


the probability function of -< A 

/a. 

The condition (1) insures that the series under the sign of integration
may be integrated over the interval $(-\infty, \infty)$. But the

series can also be integrated over the same interval if it converges 
uniformly in any fixed finite interval, which it does, and if the 

('y^ ’ 

y -/? 

converges uniformly in the interval an, anj . 

Formally, at least, 

( 5 ) , 

in which is the moment about the mean of -u . If 

( 6 ) ^ h'* ^ 

for some h >0 , then for h | | ~ c < i the right hand side 

of (5) is convergent and is < / y- as before. Now 

let us suppose that the condition (6) is satisfied not only for
moments taken over the whole interval $(-\infty, \infty)$ but also for
moments taken over any interval which includes the interval $(-b, b)$,
in which $b$ is an arbitrarily large though finite number. This is
the first regularity condition, mentioned above, which we shall
impose on the tails of the frequency function of $x$.






Then it is obvious, from the remark above, that 


is uniformly convergent in the interval for \ y\<b for J !<f I- c <1 
And for” lyl > ^ it is also obvious that for 1^1 ^ c < J , 


^ 9r> 
n^o ^ " 


is uniformly convergent if our first regularity condition is satisfied.
And since $|\varepsilon|$ may be taken arbitrarily small, the inequality (3)
follows as before.

It is evident that if our first regularity condition holds, the
condition (6) is more general than the condition (1). And it is
easily seen that this first regularity condition holds for a very
wide class of frequency functions. For, in order for it to hold,
it is sufficient that the frequency curve (continuous or not) outside
some finite interval about the mean as center, be never increasing
as $|x|$ increases and that if $f(x)$ be the ordinate of the frequency
curve at the abscissa $x$, always $f(x) > f(-x)$ or else always
$f(x) < f(-x)$ for $x > b$.

But if the first regularity condition be satisfied, then for all
intervals which include $(-b, b)$ the corresponding moments have
upper limits in absolute value. And if this be so for all such
intervals, the semi-invariants (of Thiele) will also have upper
limits for their absolute values. If $\lambda_k$ is the $k$th
semi-invariant, we will take for our second regularity condition on
the tails of the frequency distribution of $x$ that

( 7 ) kiZ CA^-yU_^) 

for some h> o if is taken for any inteival which includes 
the arbitrarily large, though finite, interval 

If this second regularity condition holds, it is again easy to 
show that (5) is an equality if c< i. The right meml>er 





of (S) is still uniformly convergent in the interval 
^ \£\^ c < J . For all intervals which include use 
the formal identity which defines the semi-invariants of Thiele: 


(8) e 








Under the condition {7)y^(^J\s uniformly convergent over the 
intervals in question for and for these values of C , 

(8) becomes an equality since its second member is only the first 
arranged in powers of £ , Moreover, on account of (7) the right 
member must be uniformly convergent for all intervals which 
include 

At least one important class of frequency distributions satisfies
our second regularity condition. The distributions of characteristics
in samples of $N$ have finite ranges as long as $N$ is finite
and they commonly have semi-invariants which are rapidly decreasing
with increasing $N$. If such distributions approach normality
their semi-invariants of order above the second approach
zero; in particular they may become in absolute value less than or
equal to the corresponding semi-invariants of a Pearson Type
III distribution which are given by




k > Z 


in which 


cz ^ 


Z M 


, or 


» rk-dj! 

Taking to sec that such distributions satis f 

our second regularity condition. The smaller the skewness of the
Type III distribution, the smaller $h$ may be taken. Thus in such


cases we can give a lower limit for tcjy the probability 

that $|x| < t\sigma$, which is improved with decreasing skewness of
the Type III distribution. By the use of the first regularity con- 
dition we could only take j as the distribution approaches 
normality. 

As a second application^ let us suppose that x ^ in 

which and are independent, and in which the semi-invari* 
ants of the distribution of are ^ * 7 v and the 

semi-mvariants of the distribution of are 
Then the distribution of pc has for semi-invariants 


^2" ’• 

Further let it be assumed that ^ < 1 , and that the distribu- 

tion of satisfies our second regularity condition. 

Then 

. PC^\<,ta)>P(^\>(\s id,) P(\)6^ itCef-a,)) 

But 

<^2 ) 


Now 


>l-2e 


cr/ 

"zTTh ijd-d,) 


<^2 




<3 -a, {'d,'** a/J ^ 

<3^ ' ^ I \io<K<i 


SO that we get 

p(\»2\ ^ t('d.d^))> 

This gives finally in such cases 


'/TJhI. 


P{’\x\i id)> P('\x),<td)-2e 



ON CORRELATION SURFACES OF SUMS WITH A
CERTAIN NUMBER OF RANDOM ELEMENTS
IN COMMON*

By

Carl H. Fischer

Introduction. The study of correlation due to a common
factor has been a more or less familiar one in the literature of
mathematical statistics. Kapteyn,¹ in an exposition of the Pearsonian
coefficient of correlation, considered the correlation between
two sums of normally distributed variables, the sums having
$n$ random elements in common. In 1920, Rietz² devised urn
schemata which yield sums with common items involved in such
a way that the correlation and regression properties can be dealt
with by a priori methods. In a later paper, Rietz³ considered two
variables, each the sum of two random drawings of elements from a
continuous rectangular distribution, with one of the elements in
common. Here, the emphasis was placed principally upon the
description of the correlation surface. Some other aspects and
extensions of this problem were brought out by Karl Pearson⁴ in
an editorial discussion of Rietz's paper.

In the literature, the theory of correlation has been discussed
principally in connection with its applications. One of the objects
of some of the above-mentioned papers is the establishment of a
closer connection between correlation theory and abstract
probability theory. Such a connection would give a more precise

*Presented to the American Mathematical Society, Dec. 28, 1931.

¹J. C. Kapteyn, "Definition of the Correlation-Coefficient," Monthly
Notices of the Royal Astronomical Society, Vol. 72 (1912), pp. 518-525.

²H. L. Rietz, "Urn Schemata as a Basis for the Development of
Correlation Theory," Annals of Mathematics, Vol. 21 (1920), pp. 306-322.

³H. L. Rietz, "A Simple Non-Normal Correlation Surface,"
Biometrika, Vol. 24 (1932), pp. 288-291.

⁴Karl Pearson, "Professor Rietz's Problem," (Editorial), Biometrika,
Vol. 24 (1932), pp. 290-291.




meaning to correlation and would tend to make the study of
correlation theory more attractive to mathematicians. With this aim
in view, the present paper is concerned with correlation among
sums having common elements, extending and generalizing the
preceding papers in several ways.

We shall assume our drawings made from a continuous uni- 
verse characterized by a rather arbitrary law of distribution. We 
shall define $p$ sums, each of an arbitrary number of elements,
formed in such a manner that any two consecutive sums have 
elements in common, and inquire into the correlation between any 
two of these sums. The equations of the correlation surfaces 
will be expressed in terms of iterated integrals, the regression of 
each variable on the other will be shown to be linear, and the 
equations of the regression lines will be obtained. The coefficient 
of correlation may then be computed from the slopes of these lines. 

Throughout this paper we shall understand a probability
function, $f(t)$, to be, for all values of $t$ on a range $R$, a
single-valued, real-valued, non-negative, continuous function of $t$. It
is then Riemann integrable on $R$, and we shall require that
$\int_R f(t)\,dt = 1$. We define the probability that a value of $t$,
drawn at random from the range $R$, lie in the interval $(a, b)$,
$a$ and $b$ in $R$ and $b > a$, to be $\int_a^b f(t)\,dt$. We may then say
that $f(t)\,dt$ is, to within infinitesimals of higher order, the
probability that a value of $t$ drawn at random lies in the interval
$(t, t + dt)$. Bachelier⁵ has classified probabilities into those of
the first, second, and third kinds, and Craig⁶ has extended this to
probability functions, according as $R$ is the range $(-\infty, \infty)$,
$(0, \infty)$, and $(0, a)$ respectively. We shall find it convenient to
adopt this classification.

⁵L. Bachelier, "Calcul des Probabilités," (1912), p. 155.

⁶Allen T. Craig, "On the Distribution of Certain Statistics," American
Journal of Mathematics, Vol. 54 (1932), pp. 353-366.





I. Sums of elements drawn from a universe characterized
by a probability function of the first kind.

1. The correlation between two sums having random elements
in common. Let $f(t)$, a probability function of the first
kind, characterize the distribution of the variable $t$. Let the
principal variable $x_1$ be defined as the sum of $n_1$ independent
values of $t$ drawn at random. Further, let the principal variable
$x_2$ be defined as the sum of $k_{12}$ random values of the $n_1$ values
of $t$ composing $x_1$ and of $n_2 - k_{12}$ independent random values of
$t$ taken directly from the universe characterized by $f(t)$.

Theorem I. Given the sums $x_1$ and $x_2$ as defined above,
with $k_{12}$ random elements in common,

a) The marginal distributions of $x_1$ and $x_2$ are given,
respectively, by




and 


oo 


(1.12) 




-f,. -t 






b) The correlation surface, $F(x_1, x_2)$, or the simultaneous
law of distribution of $x_1$ and $x_2$, is given by

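c) The regression curves of $x_2$ on $x_1$ and of $x_1$ on $x_2$ are
linear, and are given, respectively, by the following equations:

(1.31)  $\bar{x}_2 = \dfrac{k_{12}}{n_1}\, x_1 + (n_2 - k_{12})\, M$

and

(1.32)  $\bar{x}_1 = \dfrac{k_{12}}{n_2}\, x_2 + (n_1 - k_{12})\, M$,

where

$M = \int_{-\infty}^{\infty} t f(t)\,dt$.

Hence, the coefficient of correlation between $x_1$ and $x_2$ is

$r_{12} = \dfrac{k_{12}}{(n_1 n_2)^{1/2}}$.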

Proof. The proofs of the expressions for the marginal
distributions of $x_1$ and $x_2$ are given by Craig⁷ and need not be
repeated here. The correlation surface $F(x_1, x_2)$ is derived
by a simple extension of the same method to two independent
variables.

The regression curve of $x_2$ on $x_1$ is the locus of the ordinate
of the centroid of a section of the surface for any given $x_1$.
Thus

(1.4)  $\bar{x}_2 = \dfrac{\int_{-\infty}^{\infty} x_2\, F(x_1, x_2)\,dx_2}{\int_{-\infty}^{\infty} F(x_1, x_2)\,dx_2}$.

It will be convenient in what follows to use an abbreviated nota- 
tion by letting 

( 1 . 5 ) ^ ~ / 7 ^- / 

which is merely the integrand of the marginal distribution of . 
Where no ambiguity can result, will be used in place of 

0{^X^, ^/// ' * * \ Then may be written 

^ r,^-i V/ »,-/ • • • • 

Now let Changing the variable 

⁷Allen T. Craig, loc. cit., pp. 355-356.





from X'Ao (1.4) becomes 




/\fl 







It will be noted that the terms in the numerator fall into two 
groups : those terms containing the factors tj^ , 1^2^ ' ' * 

and those terms containing the factors v ov 
Further, since the order of integration here is immaterial, the
equality of the $k_{12}$ integrals of the first group follows readily.
Similarly, the equality of the $n_2 - k_{12}$ integrals of the second
group follows. The expression (1.6) may then be written








In (1.7), it is clear that the integrations with respect to each $t_{2j}$
may be effected immediately, making use of $\int_{-\infty}^{\infty} f(t)\,dt = 1$.
In the first term of the numerator and in the denominator the
variable $v$ may likewise be integrated out. The denominator is
now equal to (1.11), the marginal distribution function of $x_1$. In
the second term of the numerator, $v f(v)$ is independent of the
remaining factors, and $\int_{-\infty}^{\infty} v f(v)\,dv$ is a constant which we shall
denote by $M$. This second term of the numerator is now equal
to $(n_2 - k_{12})\, M$ times the marginal distribution function of $x_1$.





Hence, we have now reduced the expression (1.7) for $\bar{x}_2$ to the
following form:


( 1 . 8 ) 




where /„ * J-IS - J LzB. 


4/ 


/. • ■/ Z),, • • • 'f-/ ' 

*« ‘^-(O 

To evaluate let z)/ ^ ^4 '4-7 • 

Then a, „ 

’ " ’ ' ^4 ' 4 -'’ ' “ 

I-v ‘ i/C »,•/ n,-/'" 

f.t» ^ ^ ^^4 '?■<'' " 

r r ^4 ^ nr/ ' ' ' 

" 0 > *'■00 ^ 

y h’ljo' 

^ nr/^*^^!, rt,-/"' 

The first term in the above expression for 7^^ is equal to . 
Each of the remaining n^-1 terms is equal to . Hence 


and 




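From (1.8) and (1.9), we have

$\bar{x}_2 = \dfrac{k_{12}}{n_1}\, x_1 + (n_2 - k_{12})\, M$.

In exactly the same manner, we may show that

$\bar{x}_1 = \dfrac{k_{12}}{n_2}\, x_2 + (n_1 - k_{12})\, M$.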







Making use of the fact that in the case of linear regression the
square of the correlation coefficient is equal to the product of the
slopes of the two lines of regression, we obtain

$r_{12}^2 = \dfrac{k_{12}}{n_1} \cdot \dfrac{k_{12}}{n_2} = \dfrac{k_{12}^2}{n_1 n_2}$,

which completes the proof of the theorem.

Corollary. If $x$ and $y$ are each the sum of $n$ independent
random values of a variable $t$ from a universe characterized by
$f(t)$, and have $k$ of these values in common, the coefficient of
correlation between $x$ and $y$ is equal to the ratio of the number
of values of $t$ held in common to the total number composing
each principal variable. Thus, $r_{xy} = k/n$.

This corollary of Theorem I was proved by Kapteyn⁸ for
the special case of a normal parent distribution of the variable $t$.
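
The statement of Theorem I and its corollary can be checked by simulation. The following is a minimal sketch and not part of the original paper: the function names, the standard normal parent, the number of trials and the fixed seed are all assumptions made for illustration. It forms $x_1$ from $n_1$ drawings, reuses $k_{12}$ of them in $x_2$ together with $n_2 - k_{12}$ fresh drawings, and compares the sample coefficient of correlation with $k_{12}/(n_1 n_2)^{1/2}$.

```python
import math
import random


def pearson(xs, ys):
    """Sample coefficient of correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_correlation(n1, n2, k12, trials=20000, seed=0):
    """Monte Carlo estimate of the correlation between x1 and x2."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    x1s, x2s = [], []
    for _ in range(trials):
        t1 = [rng.gauss(0.0, 1.0) for _ in range(n1)]           # values composing x1
        shared = rng.sample(t1, k12)                             # k12 of them reused in x2
        fresh = [rng.gauss(0.0, 1.0) for _ in range(n2 - k12)]   # independent drawings
        x1s.append(sum(t1))
        x2s.append(sum(shared) + sum(fresh))
    return pearson(x1s, x2s)


def theoretical_correlation(n1, n2, k12):
    """Value k12 / sqrt(n1 * n2) for two sums with k12 common elements."""
    return k12 / math.sqrt(n1 * n2)
```

With $n_1 = n_2 = n$ the theoretical value reduces to $k/n$, the case of the corollary. Any other parent distribution with finite variance could be substituted for the normal drawings in this sketch.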

Illustration. As a simple illustration of the application of 
the foregoing theorem, let us consider the case where 

^ ^ with il,, , , as 

independent random drawings of / from the Gaussian distribu- 
tion, 


From (1.11), the marginal distribution of $x_1$ is

G,rx,)~r4rt/^ e-% 

Similarly, the marginal distribution of $x_2$ is

-I ^ 

G^fx^)‘r4rTj 

The correlation surface. , obtained by applying 

Ffx,.x.). ® 


(x rr ii) 


a normal correlation surface with ^ ^ * 

2. The correlation among three sums. We now proceed to
extend the preceding theorem to more than two sums. Let us
define a third sum, or principal variable, $x_3$, as the sum of


⁸J. C. Kapteyn, loc. cit.





$k_{23}$ elements taken at random from the $n_2$ values of $t$ composing
$x_2$, plus the sum of $n_3 - k_{23}$ independent random values of $t$ drawn
from the parent population. It is apparent, then, that the marginal
distributions of $x_1$, $x_2$, and $x_3$, and the correlation surfaces
$F(x_1, x_2)$ and $F(x_2, x_3)$ may be formed exactly as were
those of $x_1$ and $x_2$ in Theorem I. From this theorem, we are
at once in a position to write the equations of the lines of regression
and the coefficients of correlation for these surfaces. The
surface $F(x_1, x_3)$ remains to be investigated, as does the
four-dimensional surface, $F(x_1, x_2, x_3)$, which may be obtained
in almost the same manner.

Theorem II, Given ff^Jand , as defined above. 

Let ^ . be defined as in (1.5). Let 






'■13 




If $f(t)$ is a probability function of the first kind, then the
expression for the simultaneous distribution of $x_1$ and $x_3$ is


( 2 . 1 ) 









■■■ 



where by $\binom{c}{d}$ is understood the number of combinations of $c$
items taken $d$ at a time.

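Proof. Let us temporarily require that $k_{12} \ge k_{23}$. We
shall show later that this restriction may be removed. The
probability that $x_1$ and $x_3$ as defined contain $q$
elements in common is

$\binom{k_{12}}{q} \binom{n_2 - k_{12}}{k_{23} - q} \Big/ \binom{n_2}{k_{23}}$.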






The probability of the occurrence of any given pair of values
$(x_1, x_3)$, that is, the probability of a point falling into a given
rectangle, $dx_1\,dx_3$, is the sum of the probabilities of all of the
mutually exclusive ways in which it can occur. Each of the terms
in (2.1) multiplied by $dx_1\,dx_3$ consists
of the integral, (derived by the method of Theorem I), which is
the probability, to within infinitesimals of higher order, of the
occurrence of a given pair, $(x_1, x_3)$, with a specified number
of values of $t$ in common, times a coefficient which is equal to
the probability of the occurrence of this specified number of
values of $t$ in common. Each of the terms as a whole, then, is
the probability that the given $(x_1, x_3)$ occur with a specified
number of values of $t$ in common. Hence, the expression (2.1),
being the sum of the probabilities of all of the mutually exclusive
ways in which $x_1$ and $x_3$ can fall within the desired rectangle,
is the probability that this will occur. This establishes the theorem
when $k_{12} \ge k_{23}$.

If $k_{12} < k_{23}$, then the maximum number of values of $t$
which $x_1$ and $x_3$ can have in common is $k_{12}$. The expression
for $F(x_1, x_3)$ in this case, then, consists of the sum of all of
the terms of (2.1) beginning with the term where $x_1$ and $x_3$
have $k_{12}$ values of $t$ in common and continuing to include the
term derived from the case where they have no values of $t$ in
common. Equation (2.1), however, in its present form may be
considered as a correct formal expression for the correlation surface
even when $k_{12} < k_{23}$, since in this case all of the coefficients
of the terms where $x_1$ and $x_3$ are to have more than $k_{12}$ values
of $t$ in common are zero. This follows from the definition
$\binom{c}{d} = 0$ if $c < d$. Thus $\binom{k_{12}}{q} = 0$ whenever $q > k_{12}$.

Hence, we may now remove the restriction that $k_{12} \ge k_{23}$. This
establishes the theorem.
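
The coefficients discussed in this proof can be tabulated exactly. The sketch below is an illustration only: it assumes that the probability of $q$ common elements is the usual count for drawing $k_{23}$ of the $n_2$ elements of $x_2$ without replacement, of which $k_{12}$ came from $x_1$, and the function names are hypothetical. It shows that the coefficients vanish automatically for $q > k_{12}$, that they sum to unity, and that the expected number of common elements is $k_{12} k_{23} / n_2$.

```python
from fractions import Fraction
from math import comb


def probability_coefficients(n2, k12, k23):
    """Exact probabilities that x1 and x3 share q elements, q = 0, ..., k23.

    Assumes x3 takes k23 of the n2 elements of x2 at random, k12 of which
    also belong to x1. math.comb(c, d) returns 0 when d > c, which mirrors
    the convention that the number of combinations is zero if c < d.
    """
    total = comb(n2, k23)
    return [
        Fraction(comb(k12, q) * comb(n2 - k12, k23 - q), total)
        for q in range(k23 + 1)
    ]


def expected_common(n2, k12, k23):
    """Mean number of elements common to x1 and x3."""
    coeffs = probability_coefficients(n2, k12, k23)
    return sum(q * p for q, p in enumerate(coeffs))
```

For example, `probability_coefficients(6, 2, 4)` covers a case with $k_{12} < k_{23}$; its last two entries are zero, so no restriction on the relative size of $k_{12}$ and $k_{23}$ is needed in the sum.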





We are now in a position to write down the surface 


It is given by the following expression, where, by 
is meant any ^ values of the : 





4 


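Theorem III. The regression curves of $x_3$ on $x_1$ and of $x_1$
on $x_3$ for the correlation surface $F(x_1, x_3)$, defined in
Theorem II, are linear and are given, respectively, by the following
equations:

(2.21)  $\bar{x}_3 = \dfrac{k_{12} k_{23}}{n_1 n_2}\, x_1 + \left(n_3 - \dfrac{k_{12} k_{23}}{n_2}\right) M$

and

(2.22)  $\bar{x}_1 = \dfrac{k_{12} k_{23}}{n_2 n_3}\, x_3 + \left(n_1 - \dfrac{k_{12} k_{23}}{n_2}\right) M$,

where $M$ is defined as in Theorem I. Further, the coefficient of
correlation between $x_1$ and $x_3$ is

(2.3)  $r_{13} = \dfrac{k_{12} k_{23}}{n_2 (n_1 n_3)^{1/2}}$.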




Proof. As in the proof of Theorem I, we set up the expression
for the locus of the ordinate of the centroid of a section of
the surface for a fixed $x_1$. We have

$\bar{x}_3 = \dfrac{\int_{-\infty}^{\infty} x_3\, F(x_1, x_3)\,dx_3}{\int_{-\infty}^{\infty} F(x_1, x_3)\,dx_3}$,

where $F(x_1, x_3)$ is given by (2.1). From the definition of a
marginal distribution, we know that $\int_{-\infty}^{\infty} F(x_1, x_3)\,dx_3$ reduces
to (1.11), the marginal distribution of $x_1$. Let us now write the
expression for $\bar{x}_3$ as the sum of fractions. Thus


(2.4) 






Hereafter, we shall call an expression of the form 



a “probability coefficient.” Then (2.4) is the sum of products, 
each of which is a probability coefficient times an expression 
which is equivalent to the expression for $\bar{x}_3$ for the simple case
where $x_3$ would be derived directly from $x_1$ by the drawing of
$q$ values of $t$ from $x_1$. These latter expressions, by the
application of Theorem I, may each be written in the same form
as (1.3). Hence, (2.4) has been reduced to



/ 









/ I ^ 

By the use of a well-known theorem of combinatory analysis,® we 
have that 

and 





have that j 

and /; / 1 // t 


Moreover, 







which reduces to 


Ae -4^ ^ / j/fkr i 


by the same theorem of combinatory analysis. 
Hence, (2.5) becomes 

^4g ^JiS 






^Jt 

In exactly the same manner, we may show that 


AJ,a 


k/z kz$ •*?# 




We then obtain the coefficient of correlation from the slopes of 
these lines. It is 

This completes the proof of the theorem, since 


3. The correlation among p sums. We now extend our 
discussion to p principal variables, forming each successive one 


⁹E. Netto, "Lehrbuch der Combinatorik," (1901), pp. 12-13.





in the same manner in which $x_2$ and $x_3$ were formed above; that
is, $x_i$, $(i = 2, 3, \ldots, p)$, is equal to the sum of $k_{i-1,i}$ random
drawings of $t$ from the $n_{i-1}$ constituent values of $t$ forming $x_{i-1}$,
plus the sum of $n_i - k_{i-1,i}$ independent random drawings of $t$
directly from the universe characterized by $f(t)$. The correlation
surface, $F(x_1, x_p)$, may be written in the same
manner as the surface considered in Theorem II. That is, each
term of the expression for $F(x_1, x_p)$ multiplied by $dx_1\,dx_p$
consists of an iterated integral which represents the probability,
to within infinitesimals of higher order, of the occurrence of a
given pair, $(x_1, x_p)$, with a specified number of values of $t$ in
common, times a probability coefficient which represents the
probability of the occurrence of this specified number of values of $t$
in common. This same method may be employed in writing the
correlation surface for any pair of principal variables. The
expressions for the probability coefficients, however, become
increasingly complex as the number of ways in which the two principal
variables can have $0, 1, 2, \ldots$ values of $t$ in common increases.

The following theorem can be proved by mathematical in- 
duction. The proof is not difficult, though tedious, and on that 
account will not be presented here. 

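Theorem IV. If $f(t)$ is a probability function of the first
kind, and $F(x_1, x_p)$ is the simultaneous law of distribution of
$x_1$ and $x_p$, then the regression of $x_p$ on $x_1$ and of $x_1$ on $x_p$ are
linear and are given, respectively, by the following equations:

(3.1)  $\bar{x}_p = \dfrac{k_{12} k_{23} \cdots k_{p-1,p}}{n_1 n_2 \cdots n_{p-1}}\, x_1 + \left(n_p - \dfrac{k_{12} k_{23} \cdots k_{p-1,p}}{n_2 n_3 \cdots n_{p-1}}\right) M$,

(3.2)  $\bar{x}_1 = \dfrac{k_{12} k_{23} \cdots k_{p-1,p}}{n_2 n_3 \cdots n_p}\, x_p + \left(n_1 - \dfrac{k_{12} k_{23} \cdots k_{p-1,p}}{n_2 n_3 \cdots n_{p-1}}\right) M$.

Further, the coefficient of correlation between $x_1$ and $x_p$ is

$r_{1p} = \dfrac{k_{12} k_{23} \cdots k_{p-1,p}}{n_2 n_3 \cdots n_{p-1}\, (n_1 n_p)^{1/2}}$.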
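
The chain construction of this section lends itself to a numerical check. The following sketch is an illustration under stated assumptions, not the author's procedure: the list arguments `ns` (the numbers $n_1, \ldots, n_p$ of elements in each sum) and `ks` (the numbers $k_{12}, k_{23}, \ldots$ carried from each sum to the next), the normal parent, and the seed are all hypothetical choices. It evaluates the product of the $k$'s divided by $n_2 \cdots n_{p-1} (n_1 n_p)^{1/2}$ and compares it with a simulated chain.

```python
import math
import random


def chain_correlation(ns, ks):
    """Correlation between the first and last sums of a chain.

    ns = [n1, ..., np]; ks = [k12, k23, ..., k_{p-1,p}].
    For p = 2 the empty product is 1 and the value is k12 / sqrt(n1 * n2).
    """
    inner = math.prod(ns[1:-1])
    return math.prod(ks) / (inner * math.sqrt(ns[0] * ns[-1]))


def simulate_chain(ns, ks, trials=20000, seed=1):
    """Monte Carlo estimate of the correlation between x1 and xp."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    first, last = [], []
    for _ in range(trials):
        elems = [rng.gauss(0.0, 1.0) for _ in range(ns[0])]
        first.append(sum(elems))
        for n, k in zip(ns[1:], ks):
            # keep k elements of the previous sum, add n - k fresh drawings
            elems = rng.sample(elems, k) + [rng.gauss(0.0, 1.0) for _ in range(n - k)]
        last.append(sum(elems))
    m1, m2 = sum(first) / trials, sum(last) / trials
    sxy = sum((a - m1) * (b - m2) for a, b in zip(first, last))
    sxx = sum((a - m1) ** 2 for a in first)
    syy = sum((b - m2) ** 2 for b in last)
    return sxy / math.sqrt(sxx * syy)
```

With `ns = [4, 5, 3]` and `ks = [3, 2]` the formula gives $6/(5\sqrt{12})$, the three-sum case of Section 2.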





II. Sums of elements drawn from a universe characterized
by a probability function of the second kind.

4. The correlations between two sums. Let $f(t)$, a probability
function of the second kind, characterize the distribution
of the variable $t$. Let the principal variable $x_1$ be defined as
the sum of $n_1$ independent values of $t$ drawn at random. Further,
let the principal variable $x_2$ be defined as the sum of $k_{12}$
random values of the $n_1$ values of $t$ composing $x_1$ and of $n_2 - k_{12}$
independent random values of $t$ taken directly from the universe
characterized by $f(t)$.

Theorem V. Given the sums $x_1$ and $x_2$ as defined above
with $k_{12}$ random elements in common,

a) The marginal distributions of $x_1$ and $x_2$ are given,
respectively, by
/ X, ^Xr 4 / 4 ^ 

J / ^ frtj-frt, 

''o ^ 

" 4/ r9,-f “ ’ • <^4/ / 

and 




■tf"T 






b) The correlation surface, $F(x_1, x_2)$, which is in two
distinct parts joined along the plane $x_1 - x_2 = 0$, is given by

(4.2a) 



, 





^ 1 ^// rt,-/ ) 





^ *' ^^*4* 


{" )Cj^ ^ < ^ ) i 


(42b) 


/-^/'4/ f rf,-^ 44t^z 


- ✓ 1 f^r^r*^4 f'^/ V/ -^rt,-4rr 

'5'^..V'/7 •/ I 


I 


’0 ^O 

• 3^r/-" tffk- y 




j'^S *'r-*ik,l ^ 


^ ■•■ *^ It,-/ ) ^(''^je‘ 4-j '"^/k,g ‘ '"%'i'^^ 

(^X,SX^<co). 


.••• <*'4 




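c) The regression curves of $x_2$ on $x_1$ and of $x_1$ on $x_2$
are linear and are given, respectively, by the following equations:

(1.31)  $\bar{x}_2 = \dfrac{k_{12}}{n_1}\, x_1 + (n_2 - k_{12})\, M$

and

(1.32)  $\bar{x}_1 = \dfrac{k_{12}}{n_2}\, x_2 + (n_1 - k_{12})\, M$,

where

$M = \int_0^{\infty} t f(t)\,dt$.

Hence, the coefficient of correlation between $x_1$ and $x_2$ is

$r_{12} = \dfrac{k_{12}}{(n_1 n_2)^{1/2}}$.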
Proof. The proofs for the marginal distributions of $x_1$ and
of $x_2$ are given by Craig¹⁰ and need not be repeated here. The
expressions for the correlation surface are derived by a simple
extension of the same method to two independent variables. The
limits of integration may be easily verified.

¹⁰Allen T. Craig, loc. cit., p. 356.

As in the proof of Theorem I, the regression of $x_2$ on $x_1$
is given by the locus of the ordinate of the centroid of the section
of the surface for a given $x_1$. However, as the surface here is
in two distinct, but connected, parts, we have two terms in both
numerator and denominator. The expression for $\bar{x}_2$ is

/f f'x,. Xi )dXf ’‘Z 

(4.3) 

/ 'p;fxj,x^)dx^ I^^J(:/,x^ )dXj 

where are defined by (4,2a^ and 

(4.2b), respectively. 

In the paragraphs immediately following, we shall be concerned
principally with interchanging the order of integration,
with the accompanying changes in the limits. It will be convenient 
to write the differential immediately following its respective 
integral sign. Consider the first term of the numerator. Sue- 
cessive interchanging the order of integration between inte- 
gration with respect to x^ and with respect to ^/ » 
respectively, and making tjie appropriate changes in the limits, we 
get, writing ^ for /Z x^, 2 *,, • • X 




fAAS 


00 ~o 












Now consider the second term of the numerator of (4.3). As
the limits are constants with respect to the variables of integration





, . . . . , we may interchange the order of integration 

successively until we have 
(4.5) 






r* f 

i 

rx.;e-*,r —iik,£ % ~ »&'-? 

We may now combine the first and second terms, (4.4) and (4.5),
getting

^ -i- rnjm ^ d 

/ 


/ 


•■*7-4/-" 











^<e. 


V'" 





As the limits of integration are constant with respect to the 
variables ^ and interchange 

successively the orders of integration with respect to ^ and with 

respect to 4^ -5?^ ^ / 7 ^-/ > respectively, making 

the proper changes in the limits. We then have 





The denominator of (4.3) may be reduced to this same form
except for the absence of the factor $x_2$ in the integrand.

Let us make the transformation




as was done in the proof of Theorem I. The limits t/f*"'* 
to oo on now become C? to oe on V . We have now re- 
duced (4.3) to the following form: 

(4.7) 

W iH- 



The denominator reduces at once to (4.11). As in the 

proof of Theorem I directly following equation (1.6), it will be 
noted that the terms of the numerator fall into two groups : those 

terms containing the factor 4/ ^ 

the terms containing the factor V or^y , 

As the limits of integration with respect to each of these latter
variables are $0$ and $\infty$, and since complete interchangeability of
the order of integration is then permissible, it is readily seen that
any two of these terms are equivalent. The sum of the
entire group, then, may be written





In (4.8), it is clear that the integrations with respect to each 
may be effected immediately by making use of the hypothesis 

/ Off 

. This leaves 6^^, remaining as 

the integrand. The $\int_0^{\infty} v f(v)\,dv$ is a constant which we shall
designate by $M$. Removing this constant from under the integral
signs leaves us merely the expression for the marginal distribution
of $x_1$ times $(n_2 - k_{12})\, M$. We then have

/X . {Xt„. a^Ji' 




That each term in the summation in the right member of
(4.9) is equal to any other term in the summation, follows from
the complete interchangeability of the order of integration of any
two consecutive variables, provided a corresponding interchange
between these two variables is likewise carried out in the limits of
integration. By successive interchanges of variables we may put
the original $t_{1j}$ in any order we choose. Hence,
the sum of the last $k_{12}$ terms of (4.9) may be written as $k_{12}$
times any one of them. For definiteness, select the one containing
the factor $t_{11}$ in the integrand of the numerator. We may now
integrate out all of the $t_{2j}$ and the $v$ exactly
as before. Equation (4.9) then becomes


^ kf^ 


m: 










ex, 


It is not difficult to show that 






Hence, we have 


In exactly the same manner, we may show that 


k/^ ^ 







The coefficient of correlation between $x_1$ and $x_2$ is

$r_{12} = \dfrac{k_{12}}{(n_1 n_2)^{1/2}}$,

which completes the proof of the theorem.

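Illustration. Consider the two sums, $x_1 = t_{11} + t_{12}$, and
$x_2 = t_{11} + t_{21}$, with $t_{11}$, $t_{12}$, $t_{21}$ as independent random drawings
of $t$ from the distribution characterized by the function $f(t) = e^{-t}$
for $t$ on the range $0$ to $\infty$. From (4.11), the marginal
distribution of $x_1$ is

$G_1(x_1) = x_1 e^{-x_1}$.

Similarly, the marginal distribution of $x_2$ is

$G_2(x_2) = x_2 e^{-x_2}$.

The correlation surface, obtained by applying (4.2a) and (4.2b),
is

$F(x_1, x_2) = e^{-x_1}\left(1 - e^{-x_2}\right)$,  $(0 \le x_2 \le x_1 < \infty)$;

and

$F(x_1, x_2) = e^{-x_2}\left(1 - e^{-x_1}\right)$,  $(0 \le x_1 \le x_2 < \infty)$.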
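
This illustration can be verified numerically. The sketch below is not from the paper; it assumes $f(t) = e^{-t}$ and two sums of two elements with one in common, for which evaluating $\int_0^{\min(x_1, x_2)} f(t) f(x_1 - t) f(x_2 - t)\,dt$ gives $e^{-\max(x_1, x_2)}\,(1 - e^{-\min(x_1, x_2)})$. The function names, the truncation point and the step count are hypothetical. Integrating the surface over $x_2$ should return the marginal $x_1 e^{-x_1}$, and the centroid of a section should fall on the line $\bar{x}_2 = x_1/2 + 1$, since $M = 1$ here.

```python
import math


def surface(x1, x2):
    """Correlation surface for two sums of two exponential drawings, one shared."""
    lo, hi = min(x1, x2), max(x1, x2)
    return math.exp(-hi) * (1.0 - math.exp(-lo))


def integrate(g, upper=40.0, steps=40000):
    """Midpoint rule on (0, upper); the tail beyond `upper` is neglected."""
    h = upper / steps
    return h * sum(g((i + 0.5) * h) for i in range(steps))


def marginal_from_surface(x1):
    """Integrate the surface over x2 for a fixed x1."""
    return integrate(lambda x2: surface(x1, x2))


def regression_of_x2_on_x1(x1):
    """Ordinate of the centroid of the section of the surface at x1."""
    numerator = integrate(lambda x2: x2 * surface(x1, x2))
    return numerator / marginal_from_surface(x1)
```

The slope $1/2$ of the regression line equals $k_{12}/n_1$ with $k_{12} = 1$ and $n_1 = 2$, in agreement with the general form of the regression equations.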

5. The correlation among more than two sums. We shall 
state, without proof, the following theorems. 

Theorem VI. Given a probability function, $f(t)$, of the
second kind, and three principal variables, $x_1$, $x_2$, $x_3$, defined
as for Theorem II. Then the correlation surface $F(x_1, x_3)$
is given by

(5.1a) 








^ < a>Jj 


and 

(S.lb) 




/ 

/ 

/ 


Vi- 




‘•^' // 4 -ir -a-rf 






5>^-/ 








e(t 

4^y 


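Theorem VII. The regression curves of $x_3$ on $x_1$ and of $x_1$ on $x_3$
of the correlation surface $F(x_1, x_3)$ are linear
and are given, respectively, by the following equations:

(2.21)  $\bar{x}_3 = \dfrac{k_{12} k_{23}}{n_1 n_2}\, x_1 + \left(n_3 - \dfrac{k_{12} k_{23}}{n_2}\right) M$

and

(2.22)  $\bar{x}_1 = \dfrac{k_{12} k_{23}}{n_2 n_3}\, x_3 + \left(n_1 - \dfrac{k_{12} k_{23}}{n_2}\right) M$,

where $M$ is defined as in Theorem V. Further, the coefficient
of correlation between $x_1$ and $x_3$ is

(2.3)  $r_{13} = \dfrac{k_{12} k_{23}}{n_2 (n_1 n_3)^{1/2}}$.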

Theorem VIII. The statement of this theorem differs from
that of Theorem IV only in that $f(t)$ is now to be a probability
function of the second kind.

III. Sums of elements drawn from a universe characterized
by a probability function of the third kind.

6. The correlation between two sums. We shall now consider
principal variables defined as the sums of values of $t$ drawn
from a universe characterized by $f(t)$, a probability function of
the third kind, defined on the range $0$ to $a$, and with $a > 0$.
The correlation surfaces are not developed with the same degree
of generality as were those in the preceding pages because of the
tediousness of the labor involved and the complexity of the
correlation surface, which may consist of many sections joined
together. Thus, if $x$ is the sum of $m$ values of $t$ and $y$ the sum
of $n$, all drawn from a universe characterized by a probability
function of the third kind, the correlation surface, w* 
consists of ^rj-^sections, each having its own equation. Hence, 
only the case where $x$ and $y$ each consist of the sum of two
values of $t$, with one of these held in common, will be considered
here.


Theorem IX. Let $f(t)$, a probability function of the third
kind, characterize the distribution of a variable $t$. Let the
principal variables $x$ and $y$ be defined by the relations $x = t_1 + t_2$,
$y = t_1 + t_3$, where $t_1$, $t_2$, $t_3$ are independent random
drawings of $t$ from the universe.

a) The marginal distributions of $x$ and of $y$ are given by





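(6.11)

$G_1(x) = \int_0^{x} f(t)\, f(x - t)\,dt$,  $(0 \le x \le a)$;

$G_1(x) = \int_{x-a}^{a} f(t)\, f(x - t)\,dt$,  $(a \le x \le 2a)$;

and

(6.12)

$G_2(y) = \int_0^{y} f(t)\, f(y - t)\,dt$,  $(0 \le y \le a)$;

$G_2(y) = \int_{y-a}^{a} f(t)\, f(y - t)\,dt$,  $(a \le y \le 2a)$.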


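b) The correlation surface, $F(x, y)$, is given by

(6.2)

$F(x, y) = \int_0^{y} f(t)\, f(x - t)\, f(y - t)\,dt$,  $(0 \le y \le x \le a)$;

$= \int_0^{x} f(t)\, f(x - t)\, f(y - t)\,dt$,  $(0 \le x \le y \le a)$;

$= \int_{y-a}^{x} f(t)\, f(x - t)\, f(y - t)\,dt$,  $(a \le y \le x + a \le 2a)$;

$= \int_{x-a}^{y} f(t)\, f(x - t)\, f(y - t)\,dt$,  $(0 \le x - a \le y \le a)$;

$= \int_{x-a}^{a} f(t)\, f(x - t)\, f(y - t)\,dt$,  $(a \le y \le x \le 2a)$;

$= \int_{y-a}^{a} f(t)\, f(x - t)\, f(y - t)\,dt$,  $(a \le x \le y \le 2a)$.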

In a) and b) above, the subscripts have been omitted from the $t$'s.

c) The regression curves of $y$ on $x$ and of $x$ on $y$ are
linear and are given, respectively, by the following equations:

(6.31)  $\bar{y} = \tfrac{1}{2}\, x + M$

and

(6.32)  $\bar{x} = \tfrac{1}{2}\, y + M$,

where $M = \int_0^{a} t f(t)\,dt$.

Hence, the coefficient of correlation between $x$ and $y$ is $\tfrac{1}{2}$.





This theorem is a direct generalization of Rietz's paper in 
Biometrika cited in the introduction to this paper. The proof may 
be supplied by the reader. 

Illustration. Let us consider the rectangular distribution
given by $f(t) = 1/a$ for $t$ on the range $0$ to $a$, and $a > 0$.
This is the parent distribution in Rietz's case when $a = 1$. From
(6.11), the marginal distribution of $x$ is

$G_1(x) = x/a^2$,  $(0 \le x \le a)$;

$G_1(x) = (2a - x)/a^2$,  $(a \le x \le 2a)$.

Similarly, the marginal distribution of $y$ is

$G_2(y) = y/a^2$,  $(0 \le y \le a)$;

$G_2(y) = (2a - y)/a^2$,  $(a \le y \le 2a)$.

The application of (6.2) yields

$F(x, y) = y/a^3$,  $(0 \le y \le x \le a)$;

$= x/a^3$,  $(0 \le x \le y \le a)$;

$= (x - y + a)/a^3$,  $(a \le y \le x + a \le 2a)$;

$= (y - x + a)/a^3$,  $(0 \le x - a \le y \le a)$;

$= (2a - x)/a^3$,  $(a \le y \le x \le 2a)$;

$= (2a - y)/a^3$,  $(a \le x \le y \le 2a)$.

These results, obtained directly by the use of Theorem IX, agree
with those obtained by Rietz in the above-mentioned paper.
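
The six sections of the rectangular case can be condensed into a single expression, which makes a numerical check convenient. The sketch below is an illustration and not part of the paper: it assumes $f(t) = 1/a$ on $(0, a)$, so that the surface is $1/a^3$ times the length of the interval of $t$ on which $t$, $x - t$ and $y - t$ all lie between $0$ and $a$. The function names and the step count are hypothetical.

```python
def rectangular_surface(x, y, a=1.0):
    """Correlation surface for x = t1 + t2, y = t1 + t3, rectangular parent on (0, a)."""
    lower = max(0.0, x - a, y - a)   # t >= 0, t >= x - a, t >= y - a
    upper = min(x, y, a)             # t <= x, t <= y, t <= a
    return max(0.0, upper - lower) / a ** 3


def rectangular_marginal(x, a=1.0):
    """Triangular marginal distribution of a sum of two rectangular drawings."""
    if 0.0 <= x <= a:
        return x / a ** 2
    if a < x <= 2.0 * a:
        return (2.0 * a - x) / a ** 2
    return 0.0


def marginal_by_integration(x, a=1.0, steps=4000):
    """Midpoint-rule integral of the surface over y on (0, 2a)."""
    h = 2.0 * a / steps
    return h * sum(rectangular_surface(x, (j + 0.5) * h, a) for j in range(steps))
```

For instance, with $a = 2$, $x = 1$ and $y = 2.5$ the point lies in the section $a \le y \le x + a \le 2a$, and `rectangular_surface(1.0, 2.5, 2.0)` returns $(x - y + a)/a^3 = 0.0625$.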



ON THE CORRELATION BETWEEN CERTAIN 
AVERAGES FROM SMALL SAMPLES* 

By 

Allen T. Craig

1. Introduction. It is well known that no correlation exists 
between the arithmetic mean and standard deviation of samples 
drawn at random from a normal universe. However, there seems 
to be in the literature no treatment of the correlation between 
other averages either for normal or non-normal universes. In the 
present paper, a few simple theorems are established which make 
possible the determination of the type of regression of the median 
on the arithmetic mean, of the range on the median, and of the 
range on the arithmetic mean. In case the regression is linear, 
the coefficient of correlation may be computed. 

We shall understand a probability function $f(x)$ of a real
variable $x$ to be, for all values of $x$ on a range $R$, a single-
valued, non-negative, continuous function with $\int_R f(x)\,dx = 1$.
Then $\int_a^b f(x)\,dx$ is the probability that a value of $x$ chosen
at random lies in the interval $(a, b)$ where $a$ and $b$ are in $R$
and $a < b$; and $f(x)\,dx$ is, to within infinitesimals of higher
order, the probability that a value of $x$ chosen at random lies
in the interval $(x, x+dx)$. It will prove convenient to classify
probability functions according as $R$ is the range $(-\infty, \infty)$, $(0, \infty)$,
or $(0, k)$, $k > 0$. In accord with this classification,¹ we shall
refer to probability functions as of the first, second, and third
kinds respectively. In a similar manner, we define a probability
function $F(x, y)$ of two independent variables.

*Presented to the American Mathematical Society, Dec. 28, 1931.

¹Cf. L. Bachelier, Calcul des Probabilités, p. 155.





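2. The correlation between the arithmetic mean $\bar{x}$ and the
range $W$.

Theorem I. Let $f(x)$ be the probability function of the vari-
able $x$. Let $F_3(\bar{x}, W)$ be that of the arithmetic mean $\bar{x}$ and the
range $W$ in samples of three independent values of $x$. If $f(x)$
is a probability function of the first kind, then

$$F_3(\bar{x}, W) = 18 \int_{\bar{x}+W/3}^{\bar{x}+2W/3} f(x_1)\, f(x_1 - W)\, f(3\bar{x} - 2x_1 + W)\, dx_1 .$$

Proof. Let $x_1$, $x_2$, $x_3$ be the three observed values of $x$.
Write

$$x_1 \ge x_2 \ge x_3, \qquad 3\bar{x} = x_1 + x_2 + x_3, \qquad W = x_1 - x_3 .$$

For $\bar{x}$ assigned, $-\infty < \bar{x} < \infty$, and $W$ assigned, $0 \le W < \infty$, we
must have

$$\bar{x} + W/3 \le x_1 \le \bar{x} + 2W/3 .$$

If we consider all possible arrangements of $x_1$, $x_2$, $x_3$, the probability
element is six times the integral of $f(x_1)\, f(x_2)\, f(x_3)$ over this range of $x_1$.

Let

$$x_2 = 3\bar{x} - 2x_1 + W, \qquad x_3 = x_1 - W .$$

The absolute value of the Jacobian is 3. Hence the theorem.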





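In the case of samples of four independent items $x_1$, $x_2$, $x_3$, $x_4$,
the probability function is given by

$$F_4(\bar{x}, W) = 48 \int_{\bar{x}+W/4}^{\bar{x}+W/2} \int_{4\bar{x}-3x_1+W}^{x_1} f(x_1)\, f(x_2)\, f(4\bar{x} - 2x_1 - x_2 + W)\, f(x_1 - W)\, dx_2\, dx_1$$

$$\qquad + 48 \int_{\bar{x}+W/2}^{\bar{x}+3W/4} \int_{x_1-W}^{4\bar{x}-3x_1+2W} f(x_1)\, f(x_2)\, f(4\bar{x} - 2x_1 - x_2 + W)\, f(x_1 - W)\, dx_2\, dx_1 .$$

We note that the probability function is made up of the sum of
two parts depending on whether $x_1$ is in the interval $(\bar{x}+W/4,\ \bar{x}+W/2)$
or in the interval $(\bar{x}+W/2,\ \bar{x}+3W/4)$. Moreover, it may be of
interest to note the overlapping of the ranges of integration of
$x_2$. To prove that $F_4(\bar{x}, W)$ is given as stated, we take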

(1) 


From (1) it readily follows that 

( 2 ) JSx^ - 4x ^ W, 

For assigned values of x and W, the upper limit on is found 
from ( 2 ) by taking ^ Thus . 

Similarly, the' lower limit on is found from (2) by taking 

^ X ^ . But may not always 

be as large as for all values of - This may be seen by taking 
aud x^-lV in ( 2 ). This leads to 

Thus, for X X ^ , we see that x, is the upper limit 

on To determine the lower limit on for this region of 
variation of , we select x^ as near x^^x^-'W as is possible 
without causing to exceed , But Vl^ . 

At most, then 4x-^x^-x^’/- x, or 4x- Jx^ 

Thus we have established the limits of integration used in the
first part of the sum of which $F_4(\bar{x}, W)$ consists. A similar argu-
ment shows, if $\bar{x} + W/2 \le x_1 \le \bar{x} + 3W/4$, that

$$x_1 - W \le x_2 \le 4\bar{x} - 3x_1 + 2W .$$





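If $f(x)$ is a probability function of the second kind, we
observe in samples of three independent items $x_1$, $x_2$, $x_3$, for $\bar{x}$
assigned, that $0 \le W \le 3\bar{x}$. If $0 \le W \le 3\bar{x}/2$, we have

$$\bar{x} + W/3 \le x_1 \le \bar{x} + 2W/3,$$

and if $3\bar{x}/2 < W \le 3\bar{x}$, we have

$$W \le x_1 \le \bar{x} + 2W/3,$$

where in both cases

$$x_2 = 3\bar{x} - 2x_1 + W, \qquad x_3 = x_1 - W .$$

Accordingly,

$$F_3(\bar{x}, W) = 18 \int_{\bar{x}+W/3}^{\bar{x}+2W/3} f(x_1)\, f(x_1 - W)\, f(3\bar{x} - 2x_1 + W)\, dx_1, \qquad 0 \le W \le 3\bar{x}/2;$$

$$\qquad = 18 \int_{W}^{\bar{x}+2W/3} f(x_1)\, f(x_1 - W)\, f(3\bar{x} - 2x_1 + W)\, dx_1, \qquad 3\bar{x}/2 \le W \le 3\bar{x} .$$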

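In samples of four independent items $x_1$, $x_2$, $x_3$, $x_4$, drawn
from a universe characterised by a law of probability of this kind,
we find

$$F_4(\bar{x}, W) = 48 \int_{\bar{x}+W/4}^{\bar{x}+W/2} \int_{4\bar{x}-3x_1+W}^{x_1} f(x_1)\, f(x_2)\, f(4\bar{x} - 2x_1 - x_2 + W)\, f(x_1 - W)\, dx_2\, dx_1$$

$$\qquad + 48 \int_{\bar{x}+W/2}^{\bar{x}+3W/4} \int_{x_1-W}^{4\bar{x}-3x_1+2W} f(x_1)\, f(x_2)\, f(4\bar{x} - 2x_1 - x_2 + W)\, f(x_1 - W)\, dx_2\, dx_1, \qquad 0 \le W \le 4\bar{x}/3;$$

$$\quad = 48 \int_{W}^{\bar{x}+W/2} \int_{4\bar{x}-3x_1+W}^{x_1} f(x_1)\, f(x_2)\, f(4\bar{x} - 2x_1 - x_2 + W)\, f(x_1 - W)\, dx_2\, dx_1$$

$$\qquad + 48 \int_{\bar{x}+W/2}^{\bar{x}+3W/4} \int_{x_1-W}^{4\bar{x}-3x_1+2W} f(x_1)\, f(x_2)\, f(4\bar{x} - 2x_1 - x_2 + W)\, f(x_1 - W)\, dx_2\, dx_1, \qquad 4\bar{x}/3 \le W \le 2\bar{x};$$

$$\quad = 48 \int_{W}^{\bar{x}+3W/4} \int_{x_1-W}^{4\bar{x}-3x_1+2W} f(x_1)\, f(x_2)\, f(4\bar{x} - 2x_1 - x_2 + W)\, f(x_1 - W)\, dx_2\, dx_1, \qquad 2\bar{x} \le W \le 4\bar{x} .$$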





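Finally, consider $f(x)$ to be a probability function of the
third kind. In samples of three independent items $x_1$, $x_2$, $x_3$,
for $0 < \bar{x} \le k/3$, we obtain $0 < W < 3\bar{x}$; for $k/3 \le \bar{x} \le 2k/3$,
we obtain $0 < W \le k$; for $2k/3 \le \bar{x} \le k$, we obtain
$0 \le W \le 3(k - \bar{x})$. It is fairly easy to see that for $\bar{x}$ and $W$
assigned as indicated, the following regions of selection of $x_1$ are
valid:

for $0 \le \bar{x} \le k/2$ and $0 \le W \le 3\bar{x}/2$,
or for $k/2 < \bar{x} \le k$ and $0 \le W \le 3(k - \bar{x})/2$, then
$\bar{x} + W/3 \le x_1 \le \bar{x} + 2W/3$;

for $0 \le \bar{x} \le k/3$ and $3\bar{x}/2 \le W \le 3\bar{x}$,
or for $k/3 \le \bar{x} \le k/2$ and $3\bar{x}/2 \le W \le 3(k - \bar{x})/2$, then
$W \le x_1 \le \bar{x} + 2W/3$;

for $2k/3 \le \bar{x} \le k$ and $3(k - \bar{x})/2 \le W \le 3(k - \bar{x})$,
or for $k/2 \le \bar{x} \le 2k/3$ and $3(k - \bar{x})/2 \le W \le 3\bar{x}/2$, then
$\bar{x} + W/3 \le x_1 \le k$;

for $k/3 \le \bar{x} \le k/2$ and $3(k - \bar{x})/2 \le W \le k$,
or for $k/2 \le \bar{x} < 2k/3$ and $3\bar{x}/2 \le W \le k$, then $W \le x_1 \le k$.

Thus,

$$F_3(\bar{x}, W) = 18 \int_{\bar{x}+W/3}^{\bar{x}+2W/3} f(x_1)\, f(x_1 - W)\, f(3\bar{x} - 2x_1 + W)\, dx_1,$$

$$\qquad = 18 \int_{W}^{\bar{x}+2W/3} f(x_1)\, f(x_1 - W)\, f(3\bar{x} - 2x_1 + W)\, dx_1,$$

$$\qquad = 18 \int_{\bar{x}+W/3}^{k} f(x_1)\, f(x_1 - W)\, f(3\bar{x} - 2x_1 + W)\, dx_1,$$

$$\qquad = 18 \int_{W}^{k} f(x_1)\, f(x_1 - W)\, f(3\bar{x} - 2x_1 + W)\, dx_1,$$

over those regions of the $\bar{x}W$-plane indicated above.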

In case of samples of four independent items j ^ 

, drawn from a universe characterized by a orohaKitu^ x-— 



, 

“' •‘-»'-p’»= po™.,, ,, '»"»- ‘^e „«.; 

<p 

(A) U/=-^ 

(B) |v=^^ 
lw=^ 

j^-2Jc 

(C) j W's 



v= ^ 

(D) 

Vi/= ^OlrJd 


J^urther, let 


fW ^ ^ 

(LC) V/ = ^/-A-.^^ 
IW « 
fW/= ^ 
(F)IV 

[V= ^ 
(C)|^.^ 

T 

(^vjw = 

'w --= 


and let 


0 fr' 




a 

'■ i> th» „„, aisi„„ „ 


^ c/ 

a cj 


fi;a,wj.4eV 


L[ w 


''T^ *' ) fx*-^ 




A , (A) 




=<^1 


A 7^ 


x,-W 


:^i( 




4 




(C) 






I CD) 



'>4d 


•48 



t 1 

) 

& . 

(E) 

1 — 1 

)' 

4Z-3x,i-'Wj . 

]" • 

(F) 

t t 

44-Jx, i-^WX 
xrW ). 

]^- 

(G) 


J 

4x-3x^-hWj , 

]- 

(H) 


-43 


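As illustrations of these theorems, let us find the correlation
between the range and the mean for universes of specified types.

Example 1. Let $f(x) = e^{-x}$, $0 \le x < \infty$.

For samples of three items, we have

$$F_3(\bar{x}, W) = 6W e^{-3\bar{x}}, \qquad 0 < W \le 3\bar{x}/2;$$

$$\qquad = 6(3\bar{x} - W)\, e^{-3\bar{x}}, \qquad 3\bar{x}/2 \le W \le 3\bar{x} .$$

The distributions of the marginal totals of $W$ and $\bar{x}$ are obtained
by integrating $F_3$ with regard to $\bar{x}$ and $W$ respectively. We
readily find

$$\tfrac{27}{2}\, \bar{x}^2 e^{-3\bar{x}}, \qquad 0 \le \bar{x} < \infty,$$

and

$$2e^{-W}\left(1 - e^{-W}\right), \qquad 0 \le W < \infty,$$

as previously given by the writer.² For $\bar{x}$ assigned, the mean
of the array of $W$ is $\bar{W}_{\bar{x}} = 3\bar{x}/2$. Thus the regression of
$W$ on $\bar{x}$ is linear and $r = \sqrt{15}/5$.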


²American Journal of Mathematics, Vol. 54 (1932), pp. 359, 366.
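Example 1 lends itself to a quick simulation check. The sketch below assumes the universe $f(x) = e^{-x}$ and samples of three items; the values 3/2 for the slope of the regression of $W$ on $\bar{x}$ and $\sqrt{15}/5 \approx 0.775$ for the correlation are what Theorem I gives when worked through for this universe, and the function names are hypothetical.

```python
import random

def sample_mean_and_range(n_samples=200_000, seed=2):
    """Means and ranges of samples of three items from f(x) = exp(-x)."""
    rng = random.Random(seed)
    means, ranges = [], []
    for _ in range(n_samples):
        items = [rng.expovariate(1.0) for _ in range(3)]
        means.append(sum(items) / 3)
        ranges.append(max(items) - min(items))
    return means, ranges

def slope_and_correlation(xs, ys):
    """Least-squares slope of ys on xs, and the correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

means, ranges = sample_mean_and_range()
slope, r = slope_and_correlation(means, ranges)
```

A simulation of this size only confirms the values to about two decimals; it is a sanity check on the algebra, not a substitute for it.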




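Example 2. Let $f(x) = 1/k$, $0 \le x \le k$.

For samples of three items, we have

$$F_3(\bar{x}, W) = \frac{6W}{k^3}, \qquad = \frac{6(3\bar{x} - W)}{k^3}, \qquad = \frac{6(3k - 3\bar{x} - W)}{k^3}, \qquad = \frac{18(k - W)}{k^3},$$

over those regions of the $\bar{x}W$-plane indicated above, taken in the
same order. The marginal totals³ are distributed in accord with

$$\frac{27\bar{x}^2}{2k^3}, \quad 0 \le \bar{x} \le \frac{k}{3}; \qquad \frac{9(-k^2 + 6k\bar{x} - 6\bar{x}^2)}{2k^3}, \quad \frac{k}{3} \le \bar{x} \le \frac{2k}{3}; \qquad \frac{27(k - \bar{x})^2}{2k^3}, \quad \frac{2k}{3} \le \bar{x} \le k;$$

and

$$\frac{6W(k - W)}{k^3}, \qquad 0 \le W \le k .$$

We readily find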


³H. L. Rietz, On a Certain Law of Probability of Laplace, Proc.
Int. Math. Congress, Toronto (1924), pp. 795-799.

J. O. Irwin, On the Frequency Distributions of Means, etc., Bio-
metrika, Vol. 19 (1927), pp. 225-239.

P. Hall, The Distribution of Means for Samples of Size N, Bio-
metrika, Vol. 19 (1927), pp. 240-245.

J. Neyman and E. S. Pearson, On the Use and Distribution of Certain
Test Criteria, Biometrika, Vol. 20 (1928), p. 210.




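$$\bar{W}_{\bar{x}} = \frac{3\bar{x}}{2}, \qquad 0 \le \bar{x} \le \frac{k}{3};$$

$$\qquad = \frac{5k^3 - 27k^2\bar{x} + 27k\bar{x}^2}{6k^2 - 36k\bar{x} + 36\bar{x}^2}, \qquad \frac{k}{3} \le \bar{x} \le \frac{2k}{3};$$

$$\qquad = \frac{3(k - \bar{x})}{2}, \qquad \frac{2k}{3} \le \bar{x} \le k .$$

Thus the regression curve of $W$ on $\bar{x}$ is continuous, but the
regression is non-linear for $k/3 \le \bar{x} < 2k/3$.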
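The continuity of this regression curve can be checked directly. The sketch below codes the mean of the array of $W$ for the rectangular universe on $(0, k)$ as a piecewise function: $3\bar{x}/2$ on the first third, $3(k-\bar{x})/2$ on the last third, and the rational expression $(5k^3 - 27k^2\bar{x} + 27k\bar{x}^2)/(6k^2 - 36k\bar{x} + 36\bar{x}^2)$ in between. These expressions are obtained by working Theorem I through for this universe; the function name is hypothetical.

```python
def mean_range_given_mean(xbar, k=1.0):
    """Mean of the array of W for assigned xbar, rectangular universe on (0, k)."""
    if not 0.0 <= xbar <= k:
        raise ValueError("xbar must lie in [0, k]")
    if xbar <= k / 3:
        return 1.5 * xbar
    if xbar >= 2 * k / 3:
        return 1.5 * (k - xbar)
    # Middle third: ratio of two quadratics in xbar (after cancelling k terms).
    num = 5 * k**3 - 27 * k**2 * xbar + 27 * k * xbar**2
    den = 6 * k**2 - 36 * k * xbar + 36 * xbar**2
    return num / den

left_joint = mean_range_given_mean(1 / 3 + 1e-9)    # just inside the middle third
right_joint = mean_range_given_mean(2 / 3 - 1e-9)
centre = mean_range_given_mean(0.5)
```

The denominator vanishes only outside the middle third, so the middle branch is well defined wherever it is used.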

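3. The correlation between the arithmetic mean $\bar{x}$ and the
median $\xi$.

Theorem II. Let $f(x)$ be the probability function of the
variable $x$. Let $F_3(\bar{x}, \xi)$ be that of the arithmetic mean $\bar{x}$ and
the median $\xi$ in samples of three independent values of $x$. If
$f(x)$ is a probability function of the first kind, then

$$F_3(\bar{x}, \xi) = 18 f(\xi) \int_{3\bar{x}-2\xi}^{\infty} f(x_1)\, f(3\bar{x} - \xi - x_1)\, dx_1, \qquad \xi \le \bar{x},$$

$$\qquad = 18 f(\xi) \int_{\xi}^{\infty} f(x_1)\, f(3\bar{x} - \xi - x_1)\, dx_1, \qquad \bar{x} \le \xi .$$

Proof. Let $x_1$, $x_2$, $x_3$ be the three observed values of $x$.
Write

$$x_1 \ge x_2 \ge x_3, \qquad \xi = x_2, \qquad x_1 + x_2 + x_3 = 3\bar{x} .$$

For $\bar{x}$ and $\xi$ assigned, $\xi < \bar{x}$, we must have

$$3\bar{x} - 2\xi \le x_1 < \infty, \qquad x_2 = \xi, \qquad x_3 = 3\bar{x} - \xi - x_1,$$

and for $\bar{x} \le \xi$,

$$\xi \le x_1 < \infty, \qquad x_2 = \xi, \qquad x_3 = 3\bar{x} - \xi - x_1 .$$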





If we consider all possible arrangements of $x_1$, $x_2$, $x_3$, we have

f^Xf) ffx^) dx, dXj, fi 2, 


The change of variable establishes the theorem. 

In case of samples of five independent items ^ * 

^ , the probability function is given by 



This follows immediately from the fact that for x and f as- 
signed, ^ ^ , we may have either 

Sx -3f £ X^KOO, 

dx X^S X^i^, 

Jx-f 'X,-x^-x^, 

or 

dz - 4-^ £ X^<oo^ 
fs -fcv, 

* 





and for x< ^ , we must have 

^ ^ Xf <oo 

00, 

^3’? 

x^.Sx^-f-x^-x^- X^. 

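If $f(x)$ is a probability function of the second kind, it is clear
that $0 < \xi \le 3\bar{x}/2$ in samples of three items. Then

$$F_3(\bar{x}, \xi) = 18 f(\xi) \int_{3\bar{x}-2\xi}^{3\bar{x}-\xi} f(x_1)\, f(3\bar{x} - \xi - x_1)\, dx_1, \qquad 0 < \xi \le \bar{x},$$

$$\qquad = 18 f(\xi) \int_{\xi}^{3\bar{x}-\xi} f(x_1)\, f(3\bar{x} - \xi - x_1)\, dx_1, \qquad \bar{x} < \xi \le \frac{3\bar{x}}{2} .$$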


In case of samples of five independent items drawn at ran- 
dom from a universe characterized by a probability function of 
the second kind, ca.n best be expressed in a form employ- 

ing the notation used previously. Thus we write 
/« x^-x^- x,^), 


and 


Then 


s Jx-i^ - x^-x^ 


~Xj, 




Iff 

ct c e 


^ a! x^atx^ afx^ 


i> d f 


ace 




L\^ 


U 


SO 


^20 








f-f- 


SI 






^21 


U 


iz 

o 


^JL2 




O ^ 2, 




/ 






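Finally, consider $f(x)$ to be a probability function of the
third kind. In samples of three independent items, for $0 \le \bar{x} \le k/3$,
we obtain $0 < \xi \le 3\bar{x}/2$; for $k/3 \le \bar{x} \le 2k/3$, we obtain
$(3\bar{x} - k)/2 \le \xi \le 3\bar{x}/2$; for $2k/3 \le \bar{x} \le k$, we obtain $(3\bar{x} - k)/2 \le \xi \le k$.
It is not difficult to verify for $\bar{x}$ and $\xi$ assigned as indicated,
the following regions of selection of $x_1$ are valid:

for $0 \le \bar{x} \le k/3$ and $0 \le \xi \le \bar{x}$,
or for $k/3 \le \bar{x} \le k/2$ and $3\bar{x} - k \le \xi \le \bar{x}$, then
$3\bar{x} - 2\xi \le x_1 \le 3\bar{x} - \xi$;

for $k/3 \le \bar{x} < k/2$ and $(3\bar{x} - k)/2 \le \xi \le 3\bar{x} - k$,
or for $k/2 \le \bar{x} \le k$ and $(3\bar{x} - k)/2 \le \xi \le \bar{x}$, then
$3\bar{x} - 2\xi \le x_1 \le k$;

for $0 \le \bar{x} \le k/2$ and $\bar{x} \le \xi \le 3\bar{x}/2$,
or for $k/2 \le \bar{x} \le 2k/3$ and $3\bar{x} - k \le \xi \le 3\bar{x}/2$, then
$\xi \le x_1 \le 3\bar{x} - \xi$;

for $k/2 \le \bar{x} \le 2k/3$ and $\bar{x} \le \xi \le 3\bar{x} - k$,
or for $2k/3 < \bar{x} \le k$ and $\bar{x} \le \xi \le k$, then $\xi \le x_1 \le k$.

Thus

$$F_3(\bar{x}, \xi) = 18 f(\xi) \int_{3\bar{x}-2\xi}^{3\bar{x}-\xi} f(x_1)\, f(3\bar{x} - \xi - x_1)\, dx_1,$$

$$\qquad = 18 f(\xi) \int_{3\bar{x}-2\xi}^{k} f(x_1)\, f(3\bar{x} - \xi - x_1)\, dx_1,$$

$$\qquad = 18 f(\xi) \int_{\xi}^{3\bar{x}-\xi} f(x_1)\, f(3\bar{x} - \xi - x_1)\, dx_1,$$

$$\qquad = 18 f(\xi) \int_{\xi}^{k} f(x_1)\, f(3\bar{x} - \xi - x_1)\, dx_1,$$

over those regions of the $\bar{x}\xi$-plane as indicated above.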

With samples of five items, the correlation surface is defined 
in so many parts that we shall not take the space necessary to 
consider it. 

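As illustrations of these theorems, we shall find the correla-
tion between the median and the mean for universes of specified
types.

Example 1. Let $f(x) = e^{-x}$, $0 \le x < \infty$.

For samples of three items, we have

$$F_3(\bar{x}, \xi) = 18\,\xi\, e^{-3\bar{x}}, \qquad 0 < \xi \le \bar{x};$$

$$\qquad = 18(3\bar{x} - 2\xi)\, e^{-3\bar{x}}, \qquad \bar{x} \le \xi \le \frac{3\bar{x}}{2} .$$

The distribution function of the marginal totals of $\xi$ is given by⁴

$$6e^{-2\xi}\left(1 - e^{-\xi}\right), \qquad 0 \le \xi < \infty .$$

For $\bar{x}$ assigned, the mean of the array of $\xi$ is

$$\bar{\xi}_{\bar{x}} = \frac{5\bar{x}}{6} .$$

Thus the regression of $\xi$ on $\bar{x}$ is linear and $r = 5/\sqrt{39}$.

Example 2. Let $f(x) = 1/k$, $0 \le x \le k$.

For samples of three items, we have

$$F_3(\bar{x}, \xi) = \frac{18\,\xi}{k^3}, \qquad = \frac{18(k - 3\bar{x} + 2\xi)}{k^3}, \qquad = \frac{18(3\bar{x} - 2\xi)}{k^3}, \qquad = \frac{18(k - \xi)}{k^3},$$

over those regions of the $\bar{x}\xi$-plane indicated above, taken in the same
order. The distribution function of the marginal totals of $\xi$ is given by⁵

$$\frac{6\,\xi(k - \xi)}{k^3}, \qquad 0 \le \xi \le k .$$

We find

$$\bar{\xi}_{\bar{x}} = \frac{5\bar{x}}{6}, \qquad 0 \le \bar{x} \le \frac{k}{3};$$

$$\qquad = \frac{k^3 - 9k^2\bar{x} + 27k\bar{x}^2 - 22\bar{x}^3}{2(-k^2 + 6k\bar{x} - 6\bar{x}^2)}, \qquad \frac{k}{3} \le \bar{x} \le \frac{2k}{3};$$

$$\qquad = \frac{k + 5\bar{x}}{6}, \qquad \frac{2k}{3} \le \bar{x} \le k .$$

Thus the regression curve of $\xi$ on $\bar{x}$ is continuous but the re-
gression is non-linear for $k/3 \le \bar{x} < 2k/3$.

⁴Cf. American Journal of Mathematics, Vol. 54 (1932), p. 364.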
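The first of these two examples can be checked by simulation. The sketch below assumes the universe $f(x) = e^{-x}$ with samples of three items; the slope 5/6 of the regression of the median on the mean and the correlation $5/\sqrt{39} \approx 0.80$ are the values Theorem II gives when worked through for this universe, and the function names are hypothetical.

```python
import random

def sample_mean_and_median(n_samples=200_000, seed=4):
    """Means and medians of samples of three items from f(x) = exp(-x)."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(n_samples):
        items = sorted(rng.expovariate(1.0) for _ in range(3))
        means.append(sum(items) / 3)
        medians.append(items[1])      # middle item of three
    return means, medians

def slope_and_correlation(xs, ys):
    """Least-squares slope of ys on xs, and the correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

means, medians = sample_mean_and_median()
slope, r = slope_and_correlation(means, medians)
```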

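4. The correlation between the median $\xi$ and the range $W$.

Theorem III. Let $f(x)$ be the probability function of the
variable $x$. Let $F(\xi, W)$ be that of the median $\xi$ and the range
$W$ in samples of $2m+1$ independent values of $x$. If $f(x)$ is a
probability function of the first kind, then

$$F(\xi, W) = \frac{(2m+1)!}{[(m-1)!]^2}\, f(\xi) \int_{\xi}^{\xi+W} f(x_1)\, f(x_1 - W) \left[\int_{x_1-W}^{\xi} f(t)\,dt\right]^{m-1} \left[\int_{\xi}^{x_1} f(t)\,dt\right]^{m-1} dx_1 .$$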


Proof. We have 


1 


'/ 77^1 


r - 

m'j. 

r/^ 1 

m)dt 


/ rriM^ 

\Af J 


J 




£ AS 


> 




^r. 


⁵Cf. P. R. Rider, On the Distribution of the Ratio of Mean to Standard
Deviation, etc., Biometrika, Vol. 21 (1929), pp. 136-137.





Hence the theorem. 

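If $f(x)$ is a probability function of the second kind, then
the lower limit of integration on $x_1$ is $\xi$ when $0 < W \le \xi$ and is $W$
when $\xi \le W < \infty$, the upper limit being $\xi + W$ in both cases.

Finally, consider $f(x)$ to be a probability function of the
third kind. We observe for $0 \le \xi \le k$, that $0 \le W \le k$. For
assigned values of $\xi$ and $W$, the following regions of selection
of $x_1$ are obvious:

for $0 \le \xi \le k/2$ and $0 < W < \xi$,
or for $k/2 < \xi \le k$ and $0 < W \le k - \xi$, then $\xi \le x_1 \le \xi + W$;

for $0 \le \xi \le k/2$ and $\xi < W < k - \xi$, then $W \le x_1 \le \xi + W$;

for $0 \le \xi \le k/2$ and $k - \xi \le W \le k$,
or for $k/2 \le \xi \le k$ and $\xi \le W \le k$, then $W \le x_1 \le k$;

for $k/2 \le \xi \le k$ and $k - \xi < W \le \xi$, then $\xi \le x_1 \le k$.

If we write

$$\phi(x_1) = f(x_1)\, f(x_1 - W) \left[\int_{x_1-W}^{\xi} f(t)\,dt\right]^{m-1} \left[\int_{\xi}^{x_1} f(t)\,dt\right]^{m-1},$$

we have

$$F(\xi, W) = \frac{(2m+1)!}{[(m-1)!]^2}\, f(\xi) \int_{\xi}^{\xi+W} \phi(x_1)\, dx_1, \qquad = \frac{(2m+1)!}{[(m-1)!]^2}\, f(\xi) \int_{W}^{\xi+W} \phi(x_1)\, dx_1,$$

$$\qquad = \frac{(2m+1)!}{[(m-1)!]^2}\, f(\xi) \int_{W}^{k} \phi(x_1)\, dx_1, \qquad = \frac{(2m+1)!}{[(m-1)!]^2}\, f(\xi) \int_{\xi}^{k} \phi(x_1)\, dx_1,$$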



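over those regions of the $\xi W$-plane previously indicated.
We shall consider two simple examples.

Example 1. Let $f(x) = e^{-x}$, $0 \le x < \infty$.

With samples of three items,

$$F(\xi, W) = 3e^{-3\xi}\left(e^{W} - e^{-W}\right), \qquad 0 < W \le \xi;$$

$$\qquad = 3e^{-\xi - W}\left(1 - e^{-2\xi}\right), \qquad \xi \le W < \infty .$$

The regression is readily shown to be non-linear.

Example 2. Let $f(x) = 1/k$, $0 \le x \le k$.

With samples of three items,

$$F(\xi, W) = \frac{6W}{k^3}, \qquad = \frac{6\,\xi}{k^3}, \qquad = \frac{6(k - W)}{k^3}, \qquad = \frac{6(k - \xi)}{k^3},$$

over those regions of the $\xi W$-plane which have been previously
given, taken in the same order. The mean of the array of $W$ corresponding to an assigned
$\xi$ is $\bar{W}_{\xi} = k/2$. Accordingly, there is no correlation between the
median and the range in samples of three items drawn from this
universe.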
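The absence of correlation in this last example is easy to confirm numerically. The sketch below assumes the rectangular universe on $(0, k)$ and samples of three items, for which the joint function of the median $\xi$ and range $W$ is 6 times the length of the admissible interval for the largest item $x_1$, divided by $k^3$; the function names are hypothetical, and a simple midpoint rule stands in for the exact integration.

```python
def density_median_range(xi, w, k=1.0):
    """F(xi, W) for three items from the rectangular universe on (0, k)."""
    lo = max(xi, w)        # x1 >= median, and smallest item x1 - W >= 0
    hi = min(xi + w, k)    # smallest item x1 - W <= median, and x1 <= k
    return 6.0 * max(hi - lo, 0.0) / k**3

def mean_range_given_median(xi, k=1.0, steps=20_000):
    """Mean of the array of W for assigned xi, by the midpoint rule in W."""
    h = k / steps
    num = den = 0.0
    for i in range(steps):
        w = (i + 0.5) * h
        d = density_median_range(xi, w, k)
        num += w * d
        den += d
    return num / den

array_means = [mean_range_given_median(xi) for xi in (0.1, 0.37, 0.8)]
```

Each entry of `array_means` should come out close to k/2, independent of the assigned median.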

It is easy to employ the type of argument used in establishing
Theorem III to obtain the probability function of the median and
lower quartile. Thus, if $f(x)$ is a probability function of the
second kind and $F(\xi, \eta)$ the probability function of the median
$\xi$ and the lower quartile $\eta$ in samples of $4m+1$ items, then




r 

trovf 


m-J 





ON THE DEGREE OF APPROXIMATION OF 
CERTAIN QUADRATURE FORMULAS 

By 

A. L. O'Toole

National Research Fellow.

If $f(x)$ is a continuous function of period $2\pi$ and if the
interval under consideration, say the interval from $0$ to $2\pi$, is
divided into $m$ equal parts by the $m+1$ points $x_r = 2\pi r/m$,
$r = 0, 1, 2, \ldots, m$, then the trigonometric sum of the $n$th order coinciding
in value with $f(x)$ at the points $x_r$, or the trigo-
nometric sum of the $n$th order lacking the term in $\sin nx$, is,
according as $m = 2n+1$ or $m = 2n$,

$$\tfrac{1}{2}a_0 + a_1 \cos x + a_2 \cos 2x + \cdots + a_n \cos nx + b_1 \sin x + b_2 \sin 2x + \cdots + b_n \sin nx$$

or

$$\tfrac{1}{2}a_0 + a_1 \cos x + a_2 \cos 2x + \cdots + \tfrac{1}{2}a_n \cos nx + b_1 \sin x + b_2 \sin 2x + \cdots + b_{n-1} \sin (n-1)x,$$

where

$$a_k = \frac{2}{m} \sum_{r=1}^{m} f(x_r) \cos kx_r, \qquad b_k = \frac{2}{m} \sum_{r=1}^{m} f(x_r) \sin kx_r .$$

If the Fourier coefficients of $f(x)$ be denoted by

$$\alpha_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos kx \, dx, \qquad \beta_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin kx \, dx,$$

then it has been shown¹ that the interpolating coefficients $a_k$ and
$b_k$ are approximations to the Fourier coefficients $\alpha_k$ and $\beta_k$ in
the sense of the rectangle quadrature formula, in the sense of the
trapezoid quadrature formula, in the sense of the average of the
results of two applications of Simpson's formula, and in the sense
of higher quadrature formulas. In other words, the simple rec-
tangle formulas $a_k$ and $b_k$ are as good approximations to the
areas $\alpha_k$ and $\beta_k$ as the estimates given by the trapezoid rule,
the average of two applications of Simpson's rule, or higher quad-
rature formulas.

It is the purpose of this note to discuss certain quadrature 
formulas and to observe some other conditions under which the 
rectangle formula will give as good an approximation as the more 
complicated formulas. 

The most elementary and best known of the formulas are the 
rectangle formula, the trapezoid formula, and Simpson’s formula. 
Many of the more complex rules are the results of attempts by 
different investigators- to improve by various devices the approx- 
imations given by these three simple rules. 

Suppose the area under consideration is bounded by the curve
$y = f(x)$, the $x$-axis and the ordinates at $x = a$ and $x = b$. If the
interval from $a$ to $b$ be divided into $n$ equal parts, say of
length $h$, by the $n+1$ points $x_0 = a, x_1, x_2, \ldots, x_n = b$, and
if rectangles, each of width $h$ and height $y_i = f(x_i)$,
be constructed, then the area as approximated by these $n$ rec-
tangles is

(1) $A = h \sum_{i=0}^{n-1} y_i .$


¹D. Jackson, Some Notes on Trigonometric Interpolation, Amer. Math.
Monthly, vol. xxxiii, no. 8, October 1927.

²See Runge and Willers, Encyklopädie der Mathematischen Wissen-
schaften, Bd. II:3 (1915), pp. 45-176.

³Discussion from point of view of least squares, Otto Biermann,
Monatshefte für Mathematik und Physik, 14 (1903), pp. 226-242.

For unequal intervals see Jas. W. Glover, International Mathematical
Congress, Toronto, 1924.





To find an expression for the error we assume the first deriv-
ative exists, so that for the first rectangle

$$f(x) = f(a) + (x - a)\, f'(u), \qquad a < u < x,$$

$$\int_a^{a+h} f(x)\, dx - h f(a) = \frac{h^2}{2} f'(z), \qquad a < z < a + h .$$

Hence the error for the $n$ rectangles is

(1e) $\dfrac{n h^2}{2} f'(z) = \dfrac{(b-a)^2}{2n} f'(z), \qquad a < z < b,$

i.e., an error of the order of $1/n$.

Let $n = mk$. If we approximate the area in
the first $k$ subintervals by a parabola of degree $k$ coinciding in
value with $f(x)$ at the first $k+1$ values of $x$, then integrating
Lagrange's interpolation formula an expression for the error is
obtained. If $k$ is odd then







> /V= kh, 


where 


(k^tUk-efJat. 

If k is even, then making use of Rolle’s Theorem, 


___ 


^ " '--d’ ^ X? 


where 

The error over the whole interval will be obtained by summing 
the m errors corresponding to each k subintervals. 





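If $n$ trapezoids are formed by joining the ends of successive
ordinates then the area as approximated by the sum of the areas
of these trapezoids is

(2) $A = h \left[ \tfrac{1}{2}(y_0 + y_n) + \sum_{i=1}^{n-1} y_i \right]$

and the error is

(2e) $-\dfrac{(b-a)^3}{12 n^2} f''(z), \qquad a < z < b,$

i.e., an error of the order of $1/n^2$.

Simpson's formula may be obtained by passing second de-
gree parabolas through the ends of three successive ordinates,³
that is $n = 2m$, and gives

(3) $A = \dfrac{h}{3} \left[ y_0 + y_n + 4 \sum_{i=1}^{m} y_{2i-1} + 2 \sum_{i=1}^{m-1} y_{2i} \right], \qquad n = 2m .$

The error is

(3e) $-\dfrac{(b-a)^5}{180 n^4} f^{iv}(z), \qquad a < z < b,$

i.e., an error of the order of $1/n^4$.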

To illustrate the fact that sometimes the rectangle formula
(1) gives a better approximation than the Simpson formula (3)
these formulas will be applied to the problem of finding the area
under the so-called normal curve of error. From a table⁴ giving
five places of decimals it is seen that the ordinates to the right
of $x = 4.76$ and to the left of $x = -4.76$ are everywhere zero
if the equation be written in the form $y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. Divide
the interval from $x = -4.80$ to $x = 4.80$ into eight partial in-
tervals each of length 1.20. Formula (1) gives $A = .99998$ while
(3) gives $A = .97834$, the same ordinates being used in each
case.

⁴Jas. W. Glover, Tables of Applied Mathematics in Finance, Insurance
and Statistics.
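The comparison is quick to reproduce. The sketch below evaluates the rectangle formula (1) and Simpson's formula (3) on the nine ordinates of the normal curve at $x = -4.80, -3.60, \ldots, 4.80$; it uses exact ordinates rather than a five-place table, so the last printed decimal may differ from the table-based figures, and the function names are hypothetical.

```python
import math

def phi(x):
    """Ordinate of the normal curve of error with unit standard deviation."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def rectangle(ys, h):
    """Formula (1): n rectangles of width h, heights y_0 .. y_{n-1}."""
    return h * sum(ys[:-1])

def simpson(ys, h):
    """Formula (3): requires an odd number of ordinates (n even)."""
    if len(ys) % 2 == 0:
        raise ValueError("Simpson's formula needs an odd number of ordinates")
    return h / 3 * (ys[0] + ys[-1] + 4 * sum(ys[1:-1:2]) + 2 * sum(ys[2:-1:2]))

h = 1.20
ys = [phi(-4.80 + i * h) for i in range(9)]
area_rectangle = rectangle(ys, h)
area_simpson = simpson(ys, h)
```

With so coarse a spacing the alternating weights 4 and 2 in Simpson's formula distort the sum, while the equal weights of the rectangle formula do not; that is the whole of the effect.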

There are three objections to the nature of Simpson's form-
ula. They are the lack of smoothness at the points of intersection
of the parabolas, the unequal weights attached to the odd and
even numbered ordinates, and the requirement that the number
of ordinates be odd.

Catalan⁵ notices the lack of smoothness at the intersections
of the parabolas used in setting up Simpson's rule and improves
on it by passing parabolas through three successive ordinates and
then retaining only the first half of each parabola except in the
case of the last three ordinates where it is necessary to retain the
whole parabola. To counterbalance the asymmetry introduced by
these last three ordinates he repeats the process beginning with the
last ordinate and then takes the arithmetic mean of the two results
as his formula.

This gives

(4) $A = h \left[ \tfrac{3}{8}(y_0 + y_n) + \tfrac{7}{6}(y_1 + y_{n-1}) + \tfrac{23}{24}(y_2 + y_{n-2}) + \sum_{i=3}^{n-3} y_i \right] .$

And, of course, the error is still of the order of $1/n^4$. This
formula has the additional advantage that it holds no matter
whether $n$ is even or odd.
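Catalan's construction can be tried out in a few lines. The sketch below assumes the end weights 3/8, 7/6 and 23/24 on the first three and last three ordinates and unit weight elsewhere, which is what the half-parabola construction described above yields on averaging the forward and backward passes; the function name is hypothetical. Since every piece is the integral of an interpolating parabola, the rule should be exact for any quadratic, for even and odd $n$ alike.

```python
def catalan(ys, h):
    """Catalan-type rule: end weights 3/8, 7/6, 23/24, interior weights 1."""
    n = len(ys) - 1
    if n < 5:
        raise ValueError("this form of the rule needs at least six ordinates")
    end_weights = (3 / 8, 7 / 6, 23 / 24)
    total = sum(ys[3:n - 2])                 # ordinates y_3 .. y_{n-3}
    for j, w in enumerate(end_weights):
        total += w * (ys[j] + ys[n - j])     # symmetric end corrections
    return h * total

def quadratic_area(n):
    """Area under y = x^2 on (0, 1) with n partial intervals."""
    h = 1.0 / n
    return catalan([(i * h) ** 2 for i in range(n + 1)], h)

area_odd, area_even = quadratic_area(7), quadratic_area(8)
```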

Similarly Crotti⁶ showed that the different weights attached
to the odd and even numbered ordinates in Simpson's formula is
a disadvantage. And Parmentier⁷ by subtracting Simpson's form-
ula from twice Catalan's obtained a formula in which the weights
are the reverse of those in Simpson's. Mansion⁸ gave an alterna-
tive derivation of Catalan's formula, his derivation requiring,
however, an even number of ordinates.

⁵E. Catalan, Nouvelles Annales, 1re série (1851), pp. 412-415.

⁶Crotti, Il Politecnico 33 (1885), pp. 193-207.

⁷Parmentier, Association française pour l'avancement des sciences, Ses-
sion Grenoble, 1882.

⁸Mansion, Supplement zu Mathesis 1 (1881).





Catalan's formula may be thought of as the rectangle formula
plus three correctional terms involving the first three and the last
three ordinates. In the case of an even number of ordinates a
formula⁹ involving only two such correctional terms and giving
an approximation of the order of the error in the single trapezoid,
i.e., of the order of $1/n^3$, the error in a single trapezoid of width
$(b-a)/n$ being $-\dfrac{(b-a)^3}{12 n^3} f''(z)$, can be obtained by applying
Simpson's formula to the first $2m-1$ ordinates and approximating
the remaining area by the trapezoid rule. Repeat the process from
the opposite end and take the arithmetic mean of the two results
as the quadrature formula. This gives

(5) $A = h \left[ \tfrac{5}{12}(y_0 + y_n) + \tfrac{13}{12}(y_1 + y_{n-1}) + \sum_{i=2}^{n-2} y_i \right], \qquad n = 2m - 1 .$

It is the only formula with just two correctional terms which will
give even this order of approximation in general because any
change in the coefficients of these end ordinates will introduce in
general an error of the order of the error in the rectangle formula
for a single subinterval, i.e., an error of the order of $1/n^2$.

Another important quadrature formula is called the three-
eighths rule and is obtained by passing third order parabolas
through four successive ordinates. It may be written

(6) $A = \dfrac{3h}{8} \left[ y_0 + y_n + 3 \sum_{i=1}^{m} (y_{3i-2} + y_{3i-1}) + 2 \sum_{i=1}^{m-1} y_{3i} \right], \qquad n = 3m .$

The error is

(6e) $-\dfrac{(b-a)^5}{80 n^4} f^{iv}(z), \qquad a < z < b,$

i.e., an error of the same order as the error corresponding to
Simpson's formula. The error terms derived from the Lagrange
formula show the advantage of using parabolas of even degree.

⁹Durand, Engineering News, Jan. 1894. J. Lipka, Graphical and Me-
chanical Computation, Part II, p. 226.

Besides the fact that the order of the error is the same as
that in the case of Simpson's formula, this three-eighths formula
has disadvantages similar to those mentioned in the case of Simp-
son's formula. There is still a lack of smoothness at the intersec-
tions of the parabolas; the weights attached to the ordinates are
as undesirable as before; and the number of partial intervals must
be a multiple of three.

It is possible however to do away with these disadvantages
by proceeding as follows. Pass a third order parabola through
the first four ordinates $y_0$, $y_1$, $y_2$, $y_3$. Retain only the area
in the first two partial intervals. Pass a third order parabola
through the four ordinates $y_1$, $y_2$, $y_3$, $y_4$ and retain only the
area in the central interval. Proceed in this way retaining each
time only the area in the central interval until the last four ordi-
nates are reached where it will again be necessary to retain the
area in two strips, viz., the last two partial intervals. The sum
of these areas gives the required quadrature formula. It is

(7) $A = h \left[ \tfrac{1}{3}(y_0 + y_n) + \tfrac{31}{24}(y_1 + y_{n-1}) + \tfrac{5}{6}(y_2 + y_{n-2}) + \tfrac{25}{24}(y_3 + y_{n-3}) + \sum_{i=4}^{n-4} y_i \right] .$

This formula holds for any $n$ greater than or equal to three.
From the point of view of the order of the error this formula
is, as one would expect, no better than Catalan's formula. As a
matter of fact formula (7) can be obtained from formula (4) by
subtracting from (4) the quantity
$\dfrac{h}{24}\left[(y_0 - 3y_1 + 3y_2 - y_3) + (y_n - 3y_{n-1} + 3y_{n-2} - y_{n-3})\right]$, a quantity which, in
general, is of the order of $1/n^4$.
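The weights of formula (7) can be rebuilt mechanically from the strip-by-strip description, which is a useful check on the coefficients. The sketch below assumes the standard areas under a cubic through four equally spaced ordinates: (9, 19, -5, 1)/24 for an end strip and (-1, 13, 13, -1)/24 for the central strip, in units of $h$. The function name and the use of exact fractions are illustrative choices.

```python
from fractions import Fraction as Fr

# Strip areas (in units of h) under the cubic through four equally spaced ordinates.
FIRST = [Fr(9, 24), Fr(19, 24), Fr(-5, 24), Fr(1, 24)]     # first of the three strips
CENTRAL = [Fr(-1, 24), Fr(13, 24), Fr(13, 24), Fr(-1, 24)]  # central strip
LAST = FIRST[::-1]                                          # last strip, by symmetry

def weights_formula_7(n):
    """Weights of y_0 .. y_n obtained by the cubic-strip procedure, n >= 3."""
    if n < 3:
        raise ValueError("at least four ordinates are needed")
    w = [Fr(0)] * (n + 1)

    def add(start, coeffs):
        for j, c in enumerate(coeffs):
            w[start + j] += c

    add(0, FIRST)                  # extra strip retained from the first cubic
    for s in range(n - 2):         # cubic through y_s .. y_{s+3}: central strip
        add(s, CENTRAL)
    add(n - 3, LAST)               # extra strip retained from the last cubic
    return w

w8 = weights_formula_7(8)
catalan8 = [Fr(3, 8), Fr(7, 6), Fr(23, 24), 1, 1, 1, Fr(23, 24), Fr(7, 6), Fr(3, 8)]
difference = [c - w for c, w in zip(catalan8, w8)]
```

For `n = 8` the list `difference` consists of third-difference coefficients (1, -3, 3, -1)/24 at each end and zero in the middle, in line with the remark connecting (7) and (4). For `n = 3` the procedure collapses to the ordinary three-eighths rule.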

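If $n = 4m$ and fourth order parabolas are used in approx-
imating the area in four successive partial intervals then the
formula is

(8) $A = \dfrac{2h}{45} \left[ 7(y_0 + y_n) + 32 \sum_{i=1}^{2m} y_{2i-1} + 12 \sum_{i=1}^{m} y_{4i-2} + 14 \sum_{i=1}^{m-1} y_{4i} \right], \qquad n = 4m .$

The error is

(8e) $-\dfrac{2(b-a)^7}{945 n^6} f^{vi}(z), \qquad a < z < b,$

i.e., an error of the order of $1/n^6$.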

Several modifications may be made to improve this formula^ 
For instance if then apply the fourth degree parabola 

to the ordinates X y yi f i!e > retain only the area 

in the first three strips. Apply a fourth degree parabola to the 
ordinates X f f and retain the area in the two 

central strips. And so on till in the final step it will be necessary 
to retain the area m the last three strips. Addition gives the 
formula 

( 9 ) A-^[s96£y,, - 653ry^.y^J.mry,.y^„J 

~^^6{yjg -f-iOOfy^ nr-* 

A formula which holds for any r? may be obtained by passing 
a fourth degree parabola through , Yf , ^ , Yj , and re- 
taining only the area between x and ^ , Pass a fourth degree 
parabola through Y^ , t Yf and retain only the area 

between and . And so on, retaining only the area in one 
strip, until at the end it will be necessary to retain the area in the 
last three strips. Repeat the process beginning at the last ordinate 
and take the arithmetic -mean. The result is 

( 10 ) ^yr>.i)-^('y2^yn-z ^ 

This formula can be obtained in the case of an even number of 
ordinates by retaining three strips at the beginning, two from 





then on, reversing the process and taking the arithmetic mean. 

Formulas (4), (5), (7) and (10) not only give, in general,
at least as good approximations as Simpson's formula, the trape-
zoid formula, the three-eighths formula, and the fourth degree
formula (8) respectively, but in addition have the important prop-
erty that under certain conditions they show that the simple rec-
tangle formula must give at least as good an approximation as the
higher formulas. If $f(x)$ is a function such that the curve
$y = f(x)$ actually, or at least for practical purposes, coincides
with the $x$-axis to the left of $x = a$ and to the right of $x = b$
then in dividing the interval from $a$ to $b$ into $n$ equal parts
each of length $h$ it will not affect the area required if two, one,
three or four partial intervals of length $h$ are marked off to the
left of $a$ and to the right of $b$, the number of such partial inter-
vals corresponding to (4), (5), (7) and (10) respectively. Hence
it is seen that under these conditions (4), (5), (7) and (10)
reduce to the simple rectangle formula (1).

If the curve coincides with the $x$-axis at one end of the
interval over which the area is required but does not at the other
end then formulas (4), (5), (7) and (10) become respectively

(4a) A - yn^J^ 

(Sa) y^>77-^ 

(7a) A*^hf 

( lOa) /}-= /? y„ y„,^ y^^ 

For example, consider again the normal curve of error and sup-
pose that the area to the left of the ordinate at $x = 0$ is required.

Formulas (4a), (5a), (7a) and (10a) apply and for sixteen par- 





tial intervals give respectively 4=:, 

an extra partial interval to the left of $x = -4.80$
being used in the case of (5a) in order to have an odd number
of intervals for that formula. Using thirty-two partial intervals
the same formulas give 4*, 49909 ^*^99^9 > JOOOO^ and 

JOOOO respectively. 

If, as often happens, the values of ordinates outside the in-
terval over which the area is required are known then even better
quadrature formulas may be obtained. For example, suppose that
in deriving formula (7) the ordinate $y_{-1}$ at a distance of $h$ to
the left of $y_0$ and the ordinate $y_{n+1}$ at a distance $h$ to the right
of $y_n$ are known. Then it will not be necessary to retain the
areas in double strips at the beginning and end of the interval,
and the formula for the area over the interval from $a$ to
$b$ is

(11) $A = h \left[ -\tfrac{1}{24}(y_{-1} + y_{n+1}) + \tfrac{1}{2}(y_0 + y_n) + \tfrac{25}{24}(y_1 + y_{n-1}) + \sum_{i=2}^{n-2} y_i \right] .$

It should be noted that in case $y_{-1}$ and $y_{n+1}$ are known Cata-
lan's formula reduces to (11). And, similarly, in the case of the
derivation of formula (10) it will be necessary to retain the area
in a single strip each time except in the case of the last application
of the fourth degree parabola when it will be necessary to retain
the area in the two central strips. The formula arrived at is


~24d ^ ^ ^ ' 


Formulas (11) and (12) reduce to the rectangle formula (1)
under the same conditions as in the cases of (4), (5), (7) and
(10). Likewise when the curve coincides with the $x$-axis to the
left of $x = a$ (11) and (12) become

(11a) $A = h \left[ \sum_{i=0}^{n-2} y_i + \tfrac{25}{24}\, y_{n-1} + \tfrac{1}{2}\, y_n - \tfrac{1}{24}\, y_{n+1} \right],$

(12a) A^hf y„^ ; y„ yn-t -£Zd ^r}-z '*'i44d y^3^ 


If we apply formula (11a) to finding the area under the normal
curve to the left of the ordinate at $x = 0$ and take $h = 1.20$,
$a = -4.80$ then we get $A = .49999$. In other words, in this case
(11a) gives as good a result with six ordinates as (4a) or (7a)
give with thirty-three ordinates or (5a) with thirty-four ordinates.
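This six-ordinate computation is short enough to repeat. The sketch below assumes that (11a) carries unit weight on every ordinate up to $y_{n-2}$, then 25/24 on $y_{n-1}$, 1/2 on $y_n$ and -1/24 on the outside ordinate $y_{n+1}$, as follows from retaining only central strips of the cubics; the function names are hypothetical. Exact ordinates are used instead of a five-place table, so the result agrees with .49999 only to within a unit or so in the fifth decimal.

```python
import math

def phi(x):
    """Ordinate of the normal curve of error with unit standard deviation."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def formula_11a(ys_inside, y_outside, h):
    """One-ended rule: ys_inside = y_0 .. y_n, y_outside = y_{n+1}."""
    *body, y_nm1, y_n = ys_inside
    return h * (sum(body) + 25 / 24 * y_nm1 + 0.5 * y_n - y_outside / 24)

h = 1.20
inside = [phi(-4.80 + i * h) for i in range(5)]   # x = -4.80, ..., 0
area_left_half = formula_11a(inside, phi(h), h)   # sixth ordinate at x = 1.20
```

Because the curve is symmetric about $x = 0$, the outside ordinate equals $y_{n-1}$, and the two corrections combine to leave unit weight on $y_{n-1}$ and half weight on $y_n$.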

Quadrature formulas involving parabolas of degree higher
than four have been obtained but they are to be used with caution
on account of the great freedom they allow the approximating
curves. However, modifications similar to those in this paper could
also be made for these higher formulas. And the effect of any
number of ordinates outside the ends of the interval could be
noted.

This note will be concluded with a remark on the effect of
errors in the data giving the values of the ordinates. Suppose the
quadrature formula is $A = a_0 y_0 + a_1 y_1 + \cdots + a_n y_n$
and suppose further that each $y_i$ is subject to an error $e_i$, $i = 0, 1,
2, \ldots, n$. If $e$ is the greatest of the absolute values of the $e_i$
then the error in $A$ cannot be greater than $e(a_0 + a_1 + \cdots + a_n)$
if $a_0, a_1, \ldots, a_n$ are all positive, as will be true if parabolas
of the fourth degree or lower are used. But $a_0 + a_1 + \cdots + a_n = b - a$
if the area is to be found from $x = a$ to $x = b$. Hence the error in
$A$ due to errors in the data is not greater than $e(b - a)$. When
parabolas of degree higher than four are used the coefficients in
the quadrature formula are not always positive.
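The bound $e(b - a)$ can be illustrated on Simpson's formula, whose weights are all positive and sum to $b - a$. The sketch below perturbs each ordinate of $y = x^2$ on $(0, 2)$ by a random error of absolute value at most $e$ and compares the resulting change in $A$ with the bound; the test function, interval and error size are arbitrary choices made for illustration.

```python
import random

def simpson_weights(n, h):
    """Weights a_0 .. a_n of Simpson's formula for n (even) partial intervals."""
    if n % 2:
        raise ValueError("n must be even")
    w = [h / 3 * (4 if i % 2 else 2) for i in range(n + 1)]
    w[0] = w[-1] = h / 3
    return w

def area(weights, ys):
    return sum(a * y for a, y in zip(weights, ys))

a, b, n, e = 0.0, 2.0, 10, 0.005
h = (b - a) / n
weights = simpson_weights(n, h)
rng = random.Random(3)
ys = [(a + i * h) ** 2 for i in range(n + 1)]
noisy = [y + rng.uniform(-e, e) for y in ys]     # each error is at most e in size
shift = abs(area(weights, noisy) - area(weights, ys))
bound = e * sum(weights)                         # equals e * (b - a)
```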




Vol. IV 





PUBLISHED QUARTERLY BY
AMERICAN STATISTICAL ASSOCIATION


Publication Office: Edwards Brothers, Inc., Ann Arbor, Michigan

Business Office: 530 Commerce Bldg., New York Univ., New York, N. Y.


Entered as second class matter at the Postoffice at Ann Arbor, Mich.,
under the Act of March 3rd, 1879.


STATEMENT OF OWNERSHIP
UNDER ACT OF CONGRESS OF AUGUST 24, 1912

PMUher^Amerltim Statistical Association, New York New York, 

<H* G Carver, University of Michigan, 

Maif$aginff BtHtot^'EjaXt&n S< Sekhon* Atm Arbor, Michigan, 

Bushms W, Edwards, Ann Arbor, Michigan, 

OttWtf^^merlcan Statfeticat Association, 530 Commerce Bldg>, New York 
. - " , ' City, 



POLYNOMIAL APPROXIMATION BY THE 
METHOD OF LEAST SQUARES 

By H. T. Davis 

1. Introduction. In an earlier article in the Annals of Math-
ematics the author in collaboration with V. V. Latshaw published
formulas and tables for the fitting of polynomials to data by the
method of least squares.¹ In that paper two ranges of the inde-
pendent variable were considered, one from $x = 1$ to $x = p$,
and the other from $x = -p$ to $x = p$. For the first range formulas
were given for fitting polynomials of first, second and third de-
grees to data and these formulas were reduced to tables. For the
second range formulas were given for polynomials from the first
to the seventh degrees, but these formulas were not then reduced
to tables.

It is the purpose of the present paper to supply the tables 
for the second range and hence to furnish a means of reducing 
to a minimum the numerical labor involved in fitting to data poly- 
nomials from the first to the seventh degree inclusive. Incidentally 
some novel mathematical aspects of the problem of polynomial 
approximation have been brought to light, particularly as it ap- 
plies to the existence of a set of polynomials which are orthogonal 
for a summation over discrete intervals. 

The tables have been computed in the statistical laboratory of 
Indiana University and have been checked by duplicate calcula- 
tion. The computation has been made possible by grants of funds 
by the Waterman Institute of the University. The author is par- 
ticularly indebted to Dr. V. V. Latshaw, Miss Irene Price, Byron 
Shelley, George Davis, and Miss Anna Lescisin for the work 
which they have done in connection with the various computations 
of this paper. 


¹Volume 31 (1930), pp. 52-78.





2. Formulas. Let us first consider the data to be given as
a set of equally spaced items:

    y:   y_1    y_2    ...    y_m
    x:   x_1    x_2    ...    x_m

in which we assume that the difference $x_{i+1} - x_i$ is constant.

If $m$ is an odd number, $m = 2p + 1$, we select zero as the
center of the $x$-range and without loss of generality we replace
the table just given by the following:

    y:   y_{-p}   y_{-p+1}   ...   y_{-1}   y_0   y_1   ...   y_{p-1}   y_p
    x:   -p       -p+1       ...   -1       0     1     ...   p-1       p

Let us designate by $M_r$ the moments,

(1) $M_r = \sum_{x=-p}^{p} x^r y_x .$

If $m$ is an even number, $m = 2p$, we must make a slight
change in the notation and consider the distribution,

    y:   y_{-p}       ...   y_{-2}   y_{-1}   y_1   y_2   ...   y_p
    x:   -(2p-1)/2    ...   -3/2     -1/2     1/2   3/2   ...   (2p-1)/2

The $r$-th moments, $M_r$, will be correspondingly equal to,

The method of least squares is then employed as described 
in the previous paper to determine the coefficients of the poly- 
nomial, 

(2) y ^ -f- + 7. 

It Will be unnecevssary to repeat the explicit formulas obtained 
since they have been given m the previous paper, but it wnll be 
useful in explanation of the notation of the tables to give the fol- 



d r. DAVIS 


1S7 


lowing determination of the coefficients as linear functions of 
the momcntb 


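1. The straight line, $y = a_0 + a_1 x$.

(3)    $a_0 = A M_0$,    $a_1 = A' M_1$.

2. The parabola, $y = a_0 + a_1 x + a_2 x^2$.

(4)    $a_0 = A M_0 + B M_2$,
       $a_2 = B M_0 + C M_2$,
       and $a_1$ determined from (3).

3. The cubic, $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3$.

(5)    $a_1 = A' M_1 + B' M_3$,
       $a_3 = B' M_1 + C' M_3$,
       $a_0$ and $a_2$ determined from (4).

4. The quartic, $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4$.

(6)    $a_0 = A M_0 + B M_2 + C M_4$,
       $a_2 = B M_0 + D M_2 + E M_4$,
       $a_4 = C M_0 + E M_2 + F M_4$,
       $a_1$ and $a_3$ determined from (5).

5. The quintic, $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5$.

(7)    $a_1 = A' M_1 + B' M_3 + C' M_5$,
       $a_3 = B' M_1 + D' M_3 + E' M_5$,
       $a_5 = C' M_1 + E' M_3 + F' M_5$,
       $a_0$, $a_2$, and $a_4$ determined from (6).

6. The sextic, $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + a_6 x^6$.

(8)    $a_0 = A M_0 + B M_2 + C M_4 + D M_6$,
       $a_2 = B M_0 + E M_2 + F M_4 + G M_6$,
       $a_4 = C M_0 + F M_2 + H M_4 + I M_6$,
       $a_6 = D M_0 + G M_2 + I M_4 + J M_6$,
       $a_1$, $a_3$, and $a_5$ determined from (7).

7. The septic, $y = a_0 + a_1 x + a_2 x^2 + \cdots + a_7 x^7$.

(9)    $a_1 = A' M_1 + B' M_3 + C' M_5 + D' M_7$,
       $a_3 = B' M_1 + E' M_3 + F' M_5 + G' M_7$,
       $a_5 = C' M_1 + F' M_3 + H' M_5 + I' M_7$,
       $a_7 = D' M_1 + G' M_3 + I' M_5 + J' M_7$,
       $a_0$, $a_2$, $a_4$, and $a_6$ determined from (8).

² The notation follows that of the previous paper. It should be noted
that the coefficients A, B, C, etc. for the straight line, the parabola,
the quartic, and the sextic, and the coefficients A', B', C', etc. for the
cubic, the quintic, and the septic are all given by different formulas, but
it is hoped that the omission of subscripts denoting the degrees of the
polynomials will lead to no confusion.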
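The constants entering (3) and (4) can be checked directly from the normal equations. The following is only an illustrative sketch, not the procedure by which the printed tables were computed: it assumes class marks $-p, -p+1, \ldots, p$ (p an integer or a half-integer), and the helper names `power_sum`, `line_constants`, and `parabola_constants` are hypothetical.

```python
from fractions import Fraction


def power_sum(p, k):
    """Sum of x**k over the class marks x = -p, -p+1, ..., p."""
    p = Fraction(p)
    count = int(2 * p) + 1                  # number of class marks
    return sum((-p + i) ** k for i in range(count))


def line_constants(p):
    """A and A' of formula (3): a0 = A*M0, a1 = A'*M1."""
    return 1 / power_sum(p, 0), 1 / power_sum(p, 2)


def parabola_constants(p):
    """A, B, C of formula (4): a0 = A*M0 + B*M2, a2 = B*M0 + C*M2."""
    s0, s2, s4 = (power_sum(p, k) for k in (0, 2, 4))
    det = s0 * s4 - s2 * s2                 # determinant of the 2x2 normal equations
    return s4 / det, -s2 / det, s0 / det


if __name__ == "__main__":
    print([float(v) for v in line_constants(7)])
    print([float(v) for v in parabola_constants(Fraction(13, 2))])
```

For p = 7 the straight-line constants come out as 1/15 and 1/280, which agree with the entries .(1)666 6666 667 and .(2)357 1428 571 of table 1.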

3. Orthogonal Polynomials. In a paper the significance of
which has perhaps never been fully appreciated, J. P. Gram
investigated the problem of polynomial approximation over discrete
intervals by means of orthogonal polynomials.³ This method has
since been more fully investigated by Edward Condon⁴ and his
work was made the basis of a method for obtaining least squares
polynomials by R. T. Birge and J. D. Shea.⁵ The work of the
latter, however, while effecting a simplification, does not reduce
the problem to its simplest form.

In a recent paper issued by the Hungarian National Committee
on Economic Statistics, Karl Jordan has employed orthogonal
functions in connection with binomial moments and has very
greatly simplified the numerical work of curve fitting.⁶ The
polynomials which he employs, however, appear in the form,

       $y = c_0 + c_1 (x-\alpha) + c_2 (x-\alpha)(x-\beta) + \cdots ,$

although in the final result they are numerically equivalent to the
polynomials of the present paper.

³ Über die Entwickelung reeller Functionen in Reihen mittelst der
Methode der kleinsten Quadrate. Journal für Math., vol. 94 (1883), pp.
41-73.

⁴ The Rapid Fitting of a Certain Class of Empirical Formulae by the
Method of Least Squares. Univ. of California Publications in Mathematics,
vol. 2 (1927), pp. 55-66.

⁵ A Rapid Method for Calculating the Least Square Solution of a
Polynomial of any Degree. Ibid., pp. 67-118.

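Let us begin with a set of polynomials,

       $\varphi_0(x),\ \varphi_1(x),\ \varphi_2(x),\ \ldots,\ \varphi_m(x),$

of degrees $0, 1, 2, \ldots, m$ respectively such that,

       $\sum_{x=-p}^{p} \varphi_m(x)\,\varphi_n(x) = 0$,   for $m \neq n$.

Assuming the existence of such a set of polynomials we can
approximate by means of them a function $f(x)$ which is defined
over the set of integers from $-p$ to $p$.

Writing the approximation equation,

(10)   $f(x) \cong A_0\varphi_0(x) + A_1\varphi_1(x) + \cdots + A_n\varphi_n(x),$

we multiply by $\varphi_m(x)$ and sum from $-p$ to $p$. We then obtain,

(11)   $\sum_{x=-p}^{p} f(x)\,\varphi_m(x) = A_m S_m$,   where   $S_m = \sum_{x=-p}^{p} \varphi_m^2(x)$.

If we represent the polynomial $\varphi_m(x)$ by the series,

       $\varphi_m(x) = \varphi_{m0} + \varphi_{m1}x + \varphi_{m2}x^2 + \cdots + \varphi_{mm}x^m,$

it is clear from the definition of the moments (1) that we get
from (11) the evaluation,

(12)   $A_m S_m = \varphi_{m0}M_0 + \varphi_{m1}M_1 + \cdots + \varphi_{mm}M_m .$

⁶ See Berechnung der Trendlinie auf Grund der Theorie der kleinsten
Quadrate, Budapest (1930) and Praktische Anwendung der
Trendberechnungs-Methode von Jordan, by A. Sipos, Budapest (1930).

That these coefficients are identifiable with those explicitly
given in equations (3) to (9) is a consequence of the following
consideration:

Let us approximate $f(x)$ by minimizing the following sum:

       $J = \sum_{x=-p}^{p} \left[ f(x) - A_0\varphi_0(x) - A_1\varphi_1(x) - \cdots - A_n\varphi_n(x) \right]^2 ,$

which is equivalent in its result to the somewhat different method
employed in the actual determination of the formulas (3) to (9).
Taking the derivative of J with respect to $A_r$ and equating
the result to zero we get,

       $\sum_{x=-p}^{p} \left[ f(x) - A_0\varphi_0(x) - \cdots - A_n\varphi_n(x) \right] \varphi_r(x) = 0 ,$

whence, recalling the orthogonality of the polynomials,

(13)   $A_r = \frac{1}{S_r}\sum_{x=-p}^{p} f(x)\,\varphi_r(x) = \frac{1}{S_r}\left(\varphi_{r0}M_0 + \varphi_{r1}M_1 + \cdots + \varphi_{rr}M_r\right).$

We thus see that the ratios $\varphi_{ri}/S_r$ can be written down
explicitly by comparing them with the corresponding coefficients of
$M_i$ in equations (3) to (9).

In particular, if $r = m$, we find the coefficients of the
polynomial $\varphi_m(x)$ by equating the right member of (13) with the
corresponding last row in the formulas (3) to (9). For example,
if $r = 5$, we have,

       $A_5 S_5 = \varphi_{51}M_1 + \varphi_{53}M_3 + \varphi_{55}M_5$,   while from (7)   $a_5 = C'M_1 + E'M_3 + F'M_5 .$

Hence we get, fixing the arbitrary constant factor in $\varphi_5(x)$,

       $\varphi_{51} = C'$,   $\varphi_{53} = E'$,   $\varphi_{55} = F'$.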


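By means of this identification we obtain as the first seven
polynomials the following:⁷

$\varphi_0(x) = \dfrac{1}{2p+1} = A,$

$\varphi_1(x) = \dfrac{3x}{p(p+1)(2p+1)} = A'x,$

$\varphi_2(x) = \dfrac{3^2\cdot 5}{p(p+1)(4p^2-1)(2p+3)} \left\{ x^2 - \dfrac{p(p+1)}{3} \right\} = Cx^2 + B,$

$\varphi_3(x) = \dfrac{5^2\cdot 7}{p(p^2-1)(4p^2-1)(2p+3)(p+2)} \left\{ x^3 - \dfrac{(3p^2+3p-1)x}{5} \right\} = C'x^3 + B'x,$

$\varphi_4(x) = \dfrac{3^2\cdot 5^2\cdot 7^2}{4p(p^2-1)(4p^2-1)(4p^2-9)(2p+5)(p+2)} \left\{ x^4 - \dfrac{(6p^2+6p-5)x^2}{7} + \dfrac{3p(p^2-1)(p+2)}{35} \right\} = Fx^4 + Ex^2 + C,$

$\varphi_5(x) = \dfrac{3^4\cdot 7^2\cdot 11}{4p(p^2-1)(4p^2-1)(4p^2-9)(p^2-4)(2p+5)(p+3)} \left\{ x^5 - \dfrac{5(2p^2+2p-3)x^3}{9} + \dfrac{(15p^4+30p^3-35p^2-50p+12)x}{63} \right\} = F'x^5 + E'x^3 + C'x,$

$\varphi_6(x) = \dfrac{3^2\cdot 7^2\cdot 11^2\cdot 13}{4p(p^2-1)(p^2-4)(4p^2-1)(4p^2-9)(4p^2-25)(p+3)(2p+7)} \left\{ x^6 - \dfrac{5(3p^2+3p-7)x^4}{11} + \dfrac{(5p^4+10p^3-20p^2-25p+14)x^2}{11} - \dfrac{5p(p^2-1)(p^2-4)(p+3)}{231} \right\} = Jx^6 + Ix^4 + Gx^2 + D,$

$\varphi_7(x) = \dfrac{3^3\cdot 5\cdot 11^2\cdot 13^2}{4p(p^2-1)(p^2-4)(p^2-9)(4p^2-1)(4p^2-9)(4p^2-25)(p+4)(2p+7)} \left\{ x^7 - \dfrac{7(3p^2+3p-10)x^5}{13} + \dfrac{7(15p^4+30p^3-90p^2-105p+101)x^3}{143} - \dfrac{(35p^6+105p^5-280p^4-735p^3+497p^2+882p-180)x}{429} \right\} = J'x^7 + I'x^5 + G'x^3 + D'x.$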

In order to effect the computation of the sum $S_m = \sum_{x=-p}^{p} \varphi_m^2(x)$,
we replace the value of $A_m$ given by (13) in (10) and compare
the coefficient of $x^m$ with the corresponding coefficient in the
proper formula of the set from (3) to (9). Thus for $m = 5$, since
$\varphi_5(x) = F'x^5 + E'x^3 + C'x$, we have from the quintic
approximation,

       $f(x) \cong (C'M_1 + E'M_3 + F'M_5)\,x^5$ + terms of lower degree,

       $\phantom{f(x)} = A_5 F' x^5$ + terms of lower degree,

       $\phantom{f(x)} = \dfrac{F'}{S_5}(C'M_1 + E'M_3 + F'M_5)\,x^5$ + terms of lower degree.

Equating the coefficients of $x^5$ it is clear that $S_5 = F'$.

⁷ See note at end of this section.
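The orthogonality of the polynomials, and the fact that each sum $S_m$ equals the leading coefficient of $\varphi_m(x)$, can be verified numerically for small degrees. The sketch below is an assumption-laden illustration: it writes out only $\varphi_0$ to $\varphi_3$ in closed form, works in exact rational arithmetic, and the function names `phi` and `inner` are hypothetical.

```python
from fractions import Fraction


def phi(m, p, x):
    """phi_m(x) for m = 0..3 on the range -p..p (closed forms; needs p > 1)."""
    p, x = Fraction(p), Fraction(x)
    if m == 0:
        return 1 / (2 * p + 1)
    if m == 1:
        return 3 * x / (p * (p + 1) * (2 * p + 1))
    if m == 2:
        lead = 45 / (p * (p + 1) * (4 * p * p - 1) * (2 * p + 3))
        return lead * (x * x - p * (p + 1) / 3)
    if m == 3:
        lead = 175 / (p * (p * p - 1) * (4 * p * p - 1) * (2 * p + 3) * (p + 2))
        return lead * (x ** 3 - (3 * p * p + 3 * p - 1) * x / 5)
    raise ValueError("only m = 0..3 are written out in this sketch")


def inner(m, n, p):
    """Discrete sum of phi_m(x) * phi_n(x) over x = -p, ..., p."""
    p = Fraction(p)
    marks = [-p + i for i in range(int(2 * p) + 1)]
    return sum(phi(m, p, x) * phi(n, p, x) for x in marks)


if __name__ == "__main__":
    for m in range(4):
        print([float(inner(m, n, 7)) for n in range(4)])
```

The off-diagonal sums vanish exactly, and the diagonal sums reproduce the multipliers of the highest powers, which is the relation $S_m = \varphi_{mm}$ used in the text.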
4. The Recursion Formula. It will be obvious from the
preceding discussion that the polynomials which we have investigated
are essentially the analogue of the well-known Legendre
polynomials, where the integration between the limits $-1$ and $+1$
used in the definition of the latter's orthogonality is here replaced
by the discrete sum over the integers from $-p$ to $+p$. We might,
therefore, expect to find a recursion formula connecting any
successive three of the new polynomials similar to the recursion
formula which exists for the Legendre case. It turns out that
this expectation is justified and we find the following relationship
holding between $\varphi_{m+1}(x)$, $\varphi_m(x)$, and $\varphi_{m-1}(x)$:

(14)   $(m+1)^2 (2p-m)(2p+m+2)\,\varphi_{m+1}(x) = 4(2m+1)(2m+3)\left[ x\,\varphi_m(x) - \varphi_{m-1}(x) \right].$

From this equation we easily deduce that the coefficient,
$\varphi_{m+1,m+1}$, of $x^{m+1}$ in $\varphi_{m+1}(x)$ is related to the
coefficient, $\varphi_{mm}$, of $x^m$ in $\varphi_m(x)$ as follows:

       $\varphi_{m+1,m+1} = \dfrac{4(2m+1)(2m+3)}{(m+1)^2 (2p-m)(2p+m+2)}\,\varphi_{mm} .$

From this we obtain by iteration and a proper change in
notation, the following value for $\varphi_{mm}$:

(15)   $\varphi_{mm} = \dfrac{1}{2p+1} \prod_{r=1}^{m} \dfrac{4(4r^2-1)}{r^2 (2p-r+1)(2p+r+1)} .$

The value thus obtained is at once seen to be equal to

       $S_m = \sum_{x=-p}^{p} \varphi_m^2(x).$

As an example consider the case where $m = 2$. We then have,

       $S_2 = \sigma^2 \sum_{x=-p}^{p} \left[ x^2 - p(p+1)/3 \right]^2 ,$

where we abbreviate, $\sigma = 3^2\cdot 5 / [\,p(p+1)(4p^2-1)(2p+3)\,]$.
We then obtain,

       $S_2 = \sigma^2 \left\{ p(p+1)(2p+1)(3p^2+3p-1)/15 - 2p^2(p+1)^2(2p+1)/9 + p^2(p+1)^2(2p+1)/9 \right\}$

       $\phantom{S_2} = \sigma^2\, p(p+1)(4p^2-1)(2p+3)/45 = 45 / [\,p(p+1)(4p^2-1)(2p+3)\,],$

which is seen to agree with the value of $S_2$ as calculated directly
from (15).⁸

⁸ The definition of the $\varphi_m(x)$ which we have given above was chosen
for the obvious connection which the functions in that form have with the
problem of curve fitting and with the computed values in the tables. If,
however, the coefficient of $x^m$ were reduced to unity,

       $\varphi_2(x) = x^2 - p(p+1)/3,$

etc., then the recursion formula (14) would have been,

       $\varphi_{m+1}(x) = x\,\varphi_m(x) - \dfrac{m^2\,[(2p+1)^2 - m^2]}{4(4m^2-1)}\,\varphi_{m-1}(x).$

If, moreover, $\varphi_m(x)$ as just defined were multiplied by the coefficient
$1\cdot 3\cdot 5 \cdots (2m-1)/m!$, which is the multiplier of the corresponding
Legendre polynomials, then the recursion formula becomes,

       $(m+1)\,\varphi_{m+1}(x) = (2m+1)\,x\,\varphi_m(x) - \tfrac{1}{4}\,m\,[(2p+1)^2 - m^2]\,\varphi_{m-1}(x).$
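A three-term recursion of this kind lends itself to mechanical generation of the polynomials. The sketch below assumes the recursion in the form $\varphi_{m+1}(x) = c_m\,[x\varphi_m(x) - \varphi_{m-1}(x)]$ with $c_m = 4(2m+1)(2m+3)/[(m+1)^2(2p-m)(2p+m+2)]$ and $\varphi_0 = 1/(2p+1)$; this normalization is inferred from the condition that each sum of squares equals the leading coefficient, and the names `phi_table` and `evaluate` are hypothetical.

```python
from fractions import Fraction


def phi_table(p, max_degree):
    """Coefficient lists (constant term first) of phi_0 .. phi_max_degree.

    Requires max_degree <= 2p so that no divisor (2p - m) vanishes.
    """
    p = Fraction(p)
    prev = [Fraction(0)]                       # phi_{-1} is taken as zero
    cur = [1 / (2 * p + 1)]                    # phi_0 = 1/(2p+1)
    table = [cur]
    for m in range(max_degree):
        c = Fraction(4 * (2 * m + 1) * (2 * m + 3)) / (
            (m + 1) ** 2 * (2 * p - m) * (2 * p + m + 2))
        shifted = [Fraction(0)] + cur          # coefficients of x * phi_m
        padded = prev + [Fraction(0)] * (len(shifted) - len(prev))
        nxt = [c * (s - q) for s, q in zip(shifted, padded)]
        prev, cur = cur, nxt
        table.append(cur)
    return table


def evaluate(coeffs, x):
    """Value of the polynomial with the given coefficients at x."""
    return sum(a * x ** i for i, a in enumerate(coeffs))


if __name__ == "__main__":
    for row in phi_table(7, 3):
        print([float(a) for a in row])
```

Because the arithmetic is exact, the orthogonality of the generated polynomials over $-p, \ldots, p$ can be confirmed term by term rather than to rounding error.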





5. The Polynomials of Gram. It will be at once evident that
the results obtained above permit us to define a new set of
polynomials which are orthogonal over the discrete range from
$x = 1$ to $x = p'$.

In the former paper in the Annals it was proved that the
formulas for the coefficients of the least square polynomial,

       $y = a_0' + a_1' x + a_2' x^2 + \cdots + a_n' x^n ,$

fitted to data given over the discrete range $x = 1$ to $x = p'$, can
be obtained from the coefficients, $a_0, a_1, \ldots, a_n$, of
equation (2), by means of the following substitution:


M -/77 - 




J/ I 



where $M_r$ are the moments defined by (1) and $m_r$ are the
moments,

       $m_r = \sum_{x=1}^{p'} x^r y_x .$

Conversely we can pass from the range $x = 1$ to $x = p'$
to the range $x = -p$ to $x = p$, by means of the substitution:










$p' = 2p + 1,$


2 / 







Replacing $x'$ by $x$, it is clear that new polynomials are
obtained which belong to the range $x = 1$ to $x = p'$.

The polynomials may be explicitly evaluated from $\varphi_m(x)$
as follows:

-f' m(yn-J[J b^ ^ 

where and denotes the value of ^ after 

the substitution 

These polynomials can be proved by the method of section 3
to be orthogonal over the discrete range $x = 1$ to $x = p'$, and they are
identifiable with the last lines of the formulas (3), (4), and (5)
of the Annals paper previously cited where the $m_r$ are replaced
by $x^r$.

Polynomials orthogonal over a discrete range beginning at $x = 0$
were first obtained by Gram in the paper to which we previously
referred, and hence (16) may properly be called the Gram
polynomial of the m-th degree.

The following explicit formula, in the notation of the present
paper where the range is from $x = 1$ to $x = p$, was derived by
Gram:

)/r ^ ) f'r77-hlJ(p-^Z)lfx-f) 

^ I j /97 

pTy-hi J(n7i^^Xn7-i‘3)fp-4)!fx-l)fx-2){x-3) ^ 

Since the coefficient of x.'" m fCfcy equals ^ 

and since the 
. - /im) 

coefficient of in Gram's definition J ( 

it is clear that the following equation holds between 






'^pfp'^~JXp^-4j . . fp-mVf^n7)//4'^.XXs4 . . . 

By methods previously used it can be shown that, 

P 

IL 


The first four Gram polynomials are given below explicitly:⁹

$\psi_0(x) = 1/p,$

$\psi_1(x) = \{12/[\,p(p^2-1)\,]\}\,[\,x - (p+1)/2\,],$

$\psi_2(x) = \{180/[\,p(p^2-1)(p^2-4)\,]\}\,[\,x^2 - (p+1)x + (p+1)(p+2)/6\,],$

$\psi_3(x) = \{2800/[\,p(p^2-1)(p^2-4)(p^2-9)\,]\}\,[\,x^3 - \tfrac{3}{2}(p+1)x^2 + (6p^2+15p+11)x/10 - (p+1)(p+2)(p+3)/20\,].$

6. Tables and Numerical Application. In tables 1 to 7 the
numerical values of the coefficients of equations (3) to (9) have
been tabulated for values of p by half integers. For the case of
the straight line the range of p is from 0.5 to 100.0; for the
parabola the range is from 1.0 to 100.0; for the cubic the range
is from 1.5 to 50.0; for the other polynomials the range does not
exceed p = 25.0. The tables have been computed to ten significant
figures and have been checked by duplicate calculation.
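The tabular entries are printed with a parenthesised count of the ciphers that follow the decimal point, and with the ten figures grouped three, four, three. A small formatter makes it easy to compare a computed constant with a printed one. This is a sketch only: the function name `cipher_notation` is hypothetical, and the grouping rule is read off from the printed tables rather than stated anywhere in the text.

```python
from fractions import Fraction


def cipher_notation(value):
    """Format a number in the style of the tables, e.g. .(1)666 6666 667."""
    value = Fraction(value)
    sign = "-" if value < 0 else ""
    value = abs(value)
    whole = int(value)                     # integer part, printed only if non-zero
    frac = value - whole
    zeros = 0                              # ciphers between the point and first figure
    while frac != 0 and frac * 10 < 1:
        frac *= 10
        zeros += 1
    digits = round(frac * 10 ** 10)        # ten figures after the ciphers
    if digits == 10 ** 10:                 # rounding carried into the next place
        if zeros:
            zeros, digits = zeros - 1, 10 ** 9
        else:
            whole, digits = whole + 1, 0
    text = f"{digits:010d}"
    body = f"{text[:3]} {text[3:7]} {text[7:]}"
    prefix = f"({zeros})" if zeros else ""
    return f"{sign}{whole or ''}.{prefix}{body}"


if __name__ == "__main__":
    p = Fraction(7)
    print(cipher_notation(1 / (2 * p + 1)))
    print(cipher_notation(3 / (p * (p + 1) * (2 * p + 1))))
```

Exact fractions are used so that the tenth figure is rounded from the true value and not from a binary approximation.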

⁹ These polynomials are essentially the same as those employed by
Jordan (loc. cit.) except that the summation in his work has been taken
over the numbers 0, 1, 2, ..., n-1. His polynomials are also expressed
in terms of the Newton polynomials.





In illustration of the application of these tables to the
numerical problem of polynomial approximation, we shall fit polynomials
to the data employed by Karl Pearson¹⁰ in the same connection, his
method being, however, the method of moments. The data are
from T. N. Thiele¹¹ and consist of a system of frequencies
obtained from a game of patience (solitaire):


Value of character    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20
Frequency             0    3    7   35  101   89   94   70   46   30   15    4    5    1    0
Class marks (x)      -7   -6   -5   -4   -3   -2   -1    0    1    2    3    4    5    6    7


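Computing the moments, using the values of x for this
purpose, we obtain the following:

       $M_0 = 500$,      $M_1 = -570$,     $M_2 = 2728$,      $M_3 = -5508$,
       $M_4 = 34108$,    $M_5 = -76380$,   $M_6 = 626188$,    $M_7 = -1419708$.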

These values are then substituted in equations (3) to (9)
and the coefficients of the desired polynomials thus obtained. In
illustration we shall give only the computations for the parabola.

From the value corresponding to p = 7 in column A' of
table 1 we obtain,

       $a_1 = M_1 \times$ .(2)357 1428 571 $= -2.035714$;

Similarly from the values corresponding to p = 7 in columns
A, B, and C of table 2 we compute,

       $a_0 = M_0 \times$ .151 1312 217 $- M_2 \times$ .(2)452 4886 878 $= 63.221719$,

       $a_2 = -M_0 \times$ .(2)452 4886 878 $+ M_2 \times$ .(3)242 4046 542 $= -1.6011635$.
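The arithmetic of this illustration can be reproduced in a few lines. The sketch below is offered under stated assumptions: the closed forms used for A', A, B, and C are those obtained by solving the normal equations for symmetric class marks (they are not quoted from the tables), and the names `FREQ`, `moments`, and `parabola` are hypothetical.

```python
from fractions import Fraction

# Frequencies of the patience data, class marks -7, ..., 7.
FREQ = [0, 3, 7, 35, 101, 89, 94, 70, 46, 30, 15, 4, 5, 1, 0]


def moments(freq, highest):
    """Return p and the moments M_0 .. M_highest about the central class mark."""
    p = Fraction(len(freq) - 1, 2)
    marks = [-p + i for i in range(len(freq))]
    return p, [sum(x ** r * y for x, y in zip(marks, freq))
               for r in range(highest + 1)]


def parabola(freq):
    """Coefficients (a0, a1, a2) of the least-squares parabola via (3) and (4)."""
    p, (m0, m1, m2) = moments(freq, 2)
    common = (4 * p * p - 1) * (2 * p + 3)
    big_a = 3 * (3 * p * p + 3 * p - 1) / common
    big_b = -15 / common
    big_c = 45 / (p * (p + 1) * common)
    a_prime = 3 / (p * (p + 1) * (2 * p + 1))
    return big_a * m0 + big_b * m2, a_prime * m1, big_b * m0 + big_c * m2


if __name__ == "__main__":
    print(moments(FREQ, 7)[1])
    print([float(a) for a in parabola(FREQ)])
    print([float(a) for a in parabola(FREQ[:-1])])   # even number of items
```

Dropping the last frequency, as in `parabola(FREQ[:-1])`, moves the class marks to half-integers and corresponds to entering the tables at p = 6.5.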

¹⁰ On the Systematic Fitting of Curves to Observations and
Measurements. Biometrika, vol. 1 (1902), pp. 265-303; vol. 2 (1903), pp. 1-27, in
particular, p. 18.

¹¹ Forelaesninger over Almindelig Iagttagelseslaere. Copenhagen (1889),
p. 12.











Proceeding in this manner the other coefficients are easily
computed and we obtain the following seven polynomials of
approximation:

$y = 33.333333 - 2.035714x,$

$y = 63.221719 - 2.035714x - 1.601163x^2,$

$y = 63.221719 - 9.924624x - 1.601163x^2 + .236195x^3,$

$y = 75.058367 - 9.924624x - 3.760517x^2 + .236195x^3 + .045666x^4,$

$y = 75.058367 - 20.405890x - 3.760517x^2 + 1.139793x^3 + .045666x^4 - .014922x^5,$

$y = 73.950386 - 20.405890x - 3.320586x^2 + 1.139793x^3 + .020890x^4 - .014922x^5 + .000338551x^6,$

$y = 73.950386 - 25.763034x - 3.320586x^2 + 2.070344x^3 + .020890x^4 - .054118x^5 + .000338551x^6 + .0004607106x^7.$

The following table contains the values computed from these
polynomials over the range from $x = -7$ to $x = 7$:


  x     y     n = 1     n = 2     n = 3     n = 4     n = 5     n = 6     n = 7

 -7     0    47.583     -.985   -26.778   -11.105     3.123     3.916     1.591
 -6     3    45.548    17.794    14.109     7.393    -8.863   -10.448    -3.483
 -5     7    43.512    33.371    43.291    29.685    15.773    15.469    12.432
 -4    35    41.476    45.746    62.185    51.163    50.538    51.513    45.975
 -3   101    39.440    54.918    72.208    68.309    78.982    80.073    79.537
 -2    89    37.405    60.888    74.777    78.707    92.918    93.194    97.660
 -1    94    35.369    63.656    71.309    81.032    90.625    89.932    94.397
  0    70    33.333    63.222    63.222    75.058    75.058    73.950    73.950
  1    46    31.298    59.585    51.932    61.655    52.062    51.370    46.905
  2    30    29.262    52.746    38.857    42.787    28.576    28.853    24.388
  3    15    27.226    42.704    25.415    21.516    10.843    11.935    12.471
  4     4    25.190    29.460    13.021     1.999     2.624     3.599     9.136
  5     5    23.155    13.014     3.094   -10.512     3.400     3.095     6.133
  6     1    21.119    -6.634    -2.950    -9.667     6.589     5.005    -1.959
  7     0    19.083   -29.485    -3.693    11.980    -2.249    -1.457      .869
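Coefficients of this kind can also be recomputed without the tables by solving the normal equations directly, which gives an independent check on any entry. The sketch below does so in exact rational arithmetic; it is not the tabular method of the paper, and the names `MARKS`, `FREQ`, `least_squares_poly`, and `evaluate` are hypothetical.

```python
from fractions import Fraction

MARKS = list(range(-7, 8))
FREQ = [0, 3, 7, 35, 101, 89, 94, 70, 46, 30, 15, 4, 5, 1, 0]


def least_squares_poly(marks, values, degree):
    """Solve the normal equations exactly; returns [a_0, ..., a_degree]."""
    n = degree + 1
    rows = [[Fraction(sum(x ** (i + j) for x in marks)) for j in range(n)]
            + [Fraction(sum(x ** i * y for x, y in zip(marks, values)))]
            for i in range(n)]
    for col in range(n):                        # Gauss-Jordan elimination
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def evaluate(coeffs, x):
    """Horner evaluation of a_0 + a_1 x + ... at x."""
    total = Fraction(0)
    for a in reversed(coeffs):
        total = total * x + a
    return total


if __name__ == "__main__":
    for degree in range(1, 8):
        coeffs = least_squares_poly(MARKS, FREQ, degree)
        print(degree, [round(float(a), 6) for a in coeffs])
```

Evaluating each fitted polynomial with `evaluate` at the class marks gives the columns of the table to within the rounding of the printed coefficients.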


The approximation attained by these polynomials is exhibited
in the following figures, the odd order cases being given in figure
1 and the even order cases in figure 2.

In order to illustrate the case where the number of items is
even we shall delete the last value from the series which we have
just used. The table must then be arranged as follows:


Frequency        0       3       7      35     101      89      94
Class mark   -13/2   -11/2    -9/2    -7/2    -5/2    -3/2    -1/2

Frequency       70      46      30      15       4       5       1
Class mark     1/2     3/2     5/2     7/2     9/2    11/2    13/2




The method will be sufficiently illustrated by means of the
first, second, and fifth degree polynomials. To compute these we
first obtain the moments,

       $M_0 = 500$,   $M_1 = -320$,   $M_2 = 2283$,   $M_3 = -1781$,   $M_4 = 26930.25$,   $M_5 = -1632.5$.

In order to evaluate the coefficients of the parabola we use
the value corresponding to p = 6.5 in column A' of table 1 and
the values in columns A, B, and C of table 2 as follows:

       $a_1 = M_1 \times$ .(2)439 5604 396 $= -1.406593$,

       $a_0 = M_0 \times$ .162 1093 750 $- M_2 \times$ .(2)558 0357 143 $= 68.314732$,

       $a_2 = -M_0 \times$ .(2)558 0357 143 $+ M_2 \times$ .(3)343 4065 934 $= -2.006181$.

The other coefficients are similarly obtained and we thus
derive the following approximating polynomials:

$y = 35.714286 - 1.406593x,$

$y = 68.314732 - 1.406593x - 2.006181x^2,$

$y = 83.37250 - 16.63150x - 5.172034x^2 + 1.149221x^3 + .077082x^4 - .017801x^5.$




The approximating values obtained from the parabola and the
quintic are recorded in the following table:

   x       y      n = 2      n = 5

 -6.5      0     -7.304      1.492
 -5.5      3     15.364    -12.686
 -4.5      7     34.019     13.214
 -3.5     35     48.662     49.869
 -2.5    101     59.293     79.419
 -1.5     89     65.911     93.329
 -0.5     94     68.516     90.257
  0.5     70     67.110     73.912
  1.5     46     61.691     50.922
  2.5     30     52.260     28.698
  3.5     15     38.816     13.295
  4.5      4     21.360      7.280
  5.5      5      -.108      7.592
  6.5      1    -25.589      3.407


In the article from which these data are taken, Karl Pearson
compares the efficacy of polynomial curves with that of
skew-frequency curves and shows the superiority of the latter in the
present case. It is worth noting here, however, that the least square
polynomials of the present paper give a measurably better fit than
the moment polynomials employed by Pearson. The sum of the
squares of the deviations from the data of the values obtained by
means of the sixth degree parabola of Pearson is found to be
1402.31; the same sum for the sextic of the present paper is
1091.22. For the septic the sum of the squares of the deviations
is 823.32, which compares not too unfavorably with the sum 760.91
obtained from the skew-frequency curve.
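Sums of squared deviations of this kind follow at once from an orthogonal expansion, since each added degree lowers the sum by the square of the projection of the data on the new orthogonal polynomial, divided by that polynomial's sum of squares. The sketch below illustrates the idea with a Gram-Schmidt process on the powers of x over the class marks; it is an assumption-based check, not the computation used in the paper, and `residual_sums` is a hypothetical name. Because it works with exact least-squares fits, its figures need not agree to the last digit with sums formed from rounded coefficients.

```python
from fractions import Fraction

FREQ = [0, 3, 7, 35, 101, 89, 94, 70, 46, 30, 15, 4, 5, 1, 0]


def residual_sums(values, max_degree):
    """Sum of squared deviations of the least-squares fit of each degree 0..max_degree."""
    p = Fraction(len(values) - 1, 2)
    marks = [-p + i for i in range(len(values))]
    basis, sums = [], []
    remaining = sum(Fraction(y) ** 2 for y in values)
    for degree in range(max_degree + 1):
        vec = [x ** degree for x in marks]
        for q in basis:                         # orthogonalize against lower degrees
            k = sum(a * b for a, b in zip(vec, q)) / sum(b * b for b in q)
            vec = [a - k * b for a, b in zip(vec, q)]
        basis.append(vec)
        proj = sum(a * y for a, y in zip(vec, values))
        remaining -= proj * proj / sum(a * a for a in vec)
        sums.append(remaining)
    return sums


if __name__ == "__main__":
    for degree, total in enumerate(residual_sums(FREQ, 7)):
        print(degree, round(float(total), 2))
```

The successive differences of the printed list show how much each additional degree contributes to the fit.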






TABLE I

(The numbers in parentheses denote the number of ciphers
between the decimal point and the first significant figure.)

   p           A                    A'

  0.5     .500 0000 000       2.000 0000 000
  1.0     .333 3333 333        .500 0000 000
  1.5     .250 0000 000        .200 0000 000
  2.0     .200 0000 000        .100 0000 000
  2.5     .166 6666 667        .(1)571 4285 714
  3.0     .142 8571 429        .(1)357 1428 571
  3.5     .125 0000 000        .(1)238 0952 381
  4.0     .111 1111 111        .(1)166 6666 667
  4.5     .100 0000 000        .(1)121 2121 212
  5.0     .(1)909 0909 091     .(2)909 0909 091
  5.5     .(1)833 3333 333     .(2)699 3006 993
  6.0     .(1)769 2307 692     .(2)549 4505 495
  6.5     .(1)714 2857 143     .(2)439 5604 396
  7.0     .(1)666 6666 667     .(2)357 1428 571
  7.5     .(1)625 0000 000     .(2)294 1176 471
  8.0     .(1)588 2352 941     .(2)245 0980 392
  8.5     .(1)555 5555 556     .(2)206 3983 488
  9.0     .(1)526 3157 895     .(2)175 4385 965
  9.5     .(1)500 0000 000     .(2)150 3759 398
 10.0     .(1)476 1904 762     .(2)129 8701 299
 10.5     .(1)454 5454 545     .(2)112 9305 477
 11.0     .(1)434 7826 087     .(3)988 1422 925
 11.5     .(1)416 6666 667     .(3)869 5652 174
 12.0     .(1)400 0000 000     .(3)769 2307 692
 12.5     .(1)384 6153 846     .(3)683 7606 838
 13.0     .(1)370 3703 704     .(3)610 5006 105
 13.5     .(1)357 1428 571     .(3)547 3453 749
 14.0     .(1)344 8275 862     .(3)492 6108 374
 14.5     .(1)333 3333 333     .(3)444 9388 209
 15.0     .(1)322 5806 452     .(3)403 2258 064
 15.5     .(1)312 5000 000     .(3)366 5689 150
 16.0     .(1)303 0303 030     .(3)334 2245 989
 16.5     .(1)294 1176 471     .(3)305 5767 762
 17.0     .(1)285 7142 857     .(3)280 1120 448
 17.5     .(1)277 7777 778     .(3)257 4002 574
 18.0     .(1)270 2702 703     .(3)237 0791 844
 18.5     .(1)263 1578 947     .(3)218 8423 241
 19.0     .(1)256 4102 564     .(3)202 4291 498
 19.5     .(1)250 0000 000     .(3)187 6172 608
 20.0     .(1)243 9024 390     .(3)174 2160 279
 20.5     .(1)238 0952 381     .(3)162 0614 213
 21.0     .(1)232 5581 395     .(3)151 0117 789
 21.5     .(1)227 2727 273     .(3)140 9443 270
 22.0     .(1)222 2222 222     .(3)131 7523 057
 22.5     .(1)217 3913 043     .(3)123 3425 840
 23.0     .(1)212 7659 574     .(3)115 6336 725
 23.5     .(1)208 3333 333     .(3)108 5540 599
 24.0     .(1)204 0816 327     .(3)102 0408 163
 24.5     .(1)200 0000 000     .(4)960 3841 537
 25.0     .(1)196 0784 314     .(4)904 9773 756





TABLE I (Continued)

   p           A                    A'

 25.5     .(1)192 3076 923     .(4)853 7522 411
 26.0     .(1)188 6792 453     .(4)806 3215 610
 26.5     .(1)185 1851 852     .(4)762 3403 850
 27.0     .(1)181 8181 818     .(4)721 5007 215
 27.5     .(1)178 5714 286     .(4)683 5269 993
 28.0     .(1)175 4385 965     .(4)648 1721 545
 28.5     .(1)172 4137 931     .(4)615 2142 484
 29.0     .(1)169 4915 254     .(4)584 4535 359
 29.5     .(1)166 6666 667     .(4)555 7099 194
 30.0     .(1)163 9344 262     .(4)528 8207 298
 30.5     .(1)161 2903 226     .(4)503 6387 903
 31.0     .(1)158 7301 587     .(4)480 0307 220
 31.5     .(1)156 2500 000     .(4)457 8754 579
 32.0     .(1)153 8461 538     .(4)437 0629 371
 32.5     .(1)151 5151 515     .(4)417 4929 548
 33.0     .(1)149 2537 313     .(4)399 0741 480
 33.5     .(1)147 0588 235     .(4)381 7230 981
 34.0     .(1)144 9275 362     .(4)365 3635 367
 34.5     .(1)142 8571 429     .(4)349 9256 408
 35.0     .(1)140 8450 704     .(4)335 3454 058
 35.5     .(1)138 8888 889     .(4)321 5640 877
 36.0     .(1)136 9863 014     .(4)308 5277 058
 36.5     .(1)135 1351 351     .(4)296 1865 976
 37.0     .(1)133 3333 333     .(4)284 4950 213
 37.5     .(1)131 5789 474     .(4)273 4107 997
 38.0     .(1)129 8701 299     .(4)262 8949 997
 38.5     .(1)128 2051 282     .(4)252 9116 453
 39.0     .(1)126 5822 785     .(4)243 4274 586
 39.5     .(1)125 0000 000     .(4)234 4116 268
 40.0     .(1)123 4567 901     .(4)225 8355 917
 40.5     .(1)121 9512 195     .(4)217 6728 595
 41.0     .(1)120 4819 277     .(4)209 8988 288
 41.5     .(1)119 0476 190     .(4)202 4906 348
 42.0     .(1)117 6470 588     .(4)195 4270 080
 42.5     .(1)116 2790 698     .(4)188 6881 457
 43.0     .(1)114 9425 287     .(4)182 2555 952
 43.5     .(1)113 6363 636     .(4)176 1121 482
 44.0     .(1)112 3595 506     .(4)170 2417 433
 44.5     .(1)111 1111 111     .(4)164 6293 781
 45.0     .(1)109 8901 099     .(4)159 2610 288
 45.5     .(1)108 6956 522     .(4)154 1235 763
 46.0     .(1)107 5268 817     .(4)149 2047 387
 46.5     .(1)106 3829 787     .(4)144 4930 102
 47.0     .(1)105 2631 579     .(4)139 9776 036
 47.5     .(1)104 1666 667     .(4)135 6483 993
 48.0     .(1)103 0927 835     .(4)131 4958 973
 48.5     .(1)102 0408 163     .(4)127 5111 732
 49.0     .(1)101 0101 010     .(4)123 6858 380
 49.5     .(1)100 0000 000     .(4)120 0120 012
 50.0     .(2)990 0990 099     .(4)116 4822 365





TABLE I (Continued)

   p           A                    A'

 50.5     .(2)980 3921 569     .(4)113 0895 500
 51.0     .(2)970 8737 864     .(4)109 8273 514
 51.5     .(2)961 5384 615     .(4)106 6894 271
 52.0     .(2)952 3809 524     .(4)103 6699 150
 52.5     .(2)943 3962 264     .(4)100 7632 819
 53.0     .(2)934 5794 393     .(5)979 6430 181
 53.5     .(2)925 9259 259     .(5)952 6803 662
 54.0     .(2)917 4311 927     .(5)926 6981 744
 54.5     .(2)909 0909 091     .(5)901 6522 778
 55.0     .(2)900 9009 009     .(5)877 5008 775
 55.5     .(2)892 8571 429     .(5)854 2043 940
 56.0     .(2)884 9557 522     .(5)831 7253 310
 56.5     .(2)877 1929 825     .(5)810 0281 485
 57.0     .(2)869 5652 174     .(5)789 0791 446
 57.5     .(2)862 0689 655     .(5)768 8463 461
 58.0     .(2)854 7008 547     .(5)749 2994 051
 58.5     .(2)847 4576 271     .(5)730 4095 041
 59.0     .(2)840 3361 345     .(5)712 1492 665
 59.5     .(2)833 3333 333     .(5)694 4926 731
 60.0     .(2)826 4462 810     .(5)677 4149 844
 60.5     .(2)819 6721 311     .(5)660 8926 677
 61.0     .(2)813 0081 301     .(5)644 9033 290
 61.5     .(2)806 4516 129     .(5)629 4256 491
 62.0     .(2)800 0000 000     .(5)614 4393 241
 62.5     .(2)793 6507 937     .(5)599 9250 094
 63.0     .(2)787 4015 748     .(5)585 8642 670
 63.5     .(2)781 2500 000     .(5)572 2395 166
 64.0     .(2)775 1937 984     .(5)559 0339 893
 64.5     .(2)769 2307 692     .(5)546 2316 842
 65.0     .(2)763 3587 786     .(5)533 8173 277
 65.5     .(2)757 5757 576     .(5)521 7763 354
 66.0     .(2)751 8796 992     .(5)510 0947 756
 66.5     .(2)746 2686 567     .(5)498 7593 361
 67.0     .(2)740 7407 407     .(5)487 7572 920
 67.5     .(2)735 2941 176     .(5)477 0764 754
 68.0     .(2)729 9270 073     .(5)466 7052 476
 68.5     .(2)724 6376 812     .(5)456 6324 725
 69.0     .(2)719 4244 604     .(5)446 8474 910
 69.5     .(2)714 2857 143     .(5)437 3400 975
 70.0     .(2)709 2198 582     .(5)428 1005 180
 70.5     .(2)704 2253 521     .(5)419 1193 883
 71.0     .(2)699 3006 993     .(5)410 3877 343
 71.5     .(2)694 4444 444     .(5)401 8969 536
 72.0     .(2)689 6551 724     .(5)393 6387 970
 72.5     .(2)684 9315 068     .(5)385 6053 522
 73.0     .(2)680 2721 088     .(5)377 7890 275
 73.5     .(2)675 6756 757     .(5)370 1825 370
 74.0     .(2)671 1409 396     .(5)362 7788 863
 74.5     .(2)666 6666 667     .(5)355 5713 587
 75.0     .(2)662 2516 556     .(5)348 5535 030




TABLE I (Continued)

   p           A                    A'

 75.5     .(2)657 8947 368     .(5)341 7191 206
 76.0     .(2)653 5947 712     .(5)335 0622 546
 76.5     .(2)649 3506 494     .(5)328 5771 787
 77.0     .(2)645 1612 903     .(5)322 2583 868
 77.5     .(2)641 0256 410     .(5)316 1005 832
 78.0     .(2)636 9426 752     .(5)310 0986 734
 78.5     .(2)632 9113 924     .(5)304 2477 550
 79.0     .(2)628 9308 176     .(5)298 5431 096
 79.5     .(2)625 0000 000     .(5)292 9801 945
 80.0     .(2)621 1180 124     .(5)287 5546 354
 80.5     .(2)617 2839 506     .(5)282 2622 188
 81.0     .(2)613 4969 325     .(5)277 0988 855
 81.5     .(2)609 7560 976     .(5)272 0607 240
 82.0     .(2)606 0606 061     .(5)267 1439 639
 82.5     .(2)602 4096 386     .(5)262 3449 705
 83.0     .(2)598 8023 952     .(5)257 6602 389
 83.5     .(2)595 2380 952     .(5)253 0863 885
 84.0     .(2)591 7159 763     .(5)248 6201 581
 84.5     .(2)588 2352 941     .(5)244 2584 010
 85.0     .(2)584 7953 216     .(5)239 9980 800
 85.5     .(2)581 3953 488     .(5)235 8362 636
 86.0     .(2)578 0346 821     .(5)231 7701 211
 86.5     .(2)574 7126 437     .(5)227 7969 190
 87.0     .(2)571 4285 714     .(5)223 9140 170
 87.5     .(2)568 1818 182     .(5)220 1188 642
 88.0     .(2)564 9717 514     .(5)216 4089 957
 88.5     .(2)561 7977 528     .(5)212 7820 293
 89.0     .(2)558 6592 179     .(5)209 2356 621
 89.5     .(2)555 5555 556     .(5)205 7676 677
 90.0     .(2)552 4861 878     .(5)202 3758 930
 90.5     .(2)549 4505 495     .(5)199 0582 554
 91.0     .(2)546 4480 874     .(5)195 8127 404
 91.5     .(2)543 4782 609     .(5)192 6373 986
 92.0     .(2)540 5405 405     .(5)189 5303 438
 92.5     .(2)537 6344 086     .(5)186 4897 501
 93.0     .(2)534 7593 583     .(5)183 5138 498
 93.5     .(2)531 9148 936     .(5)180 6009 315
 94.0     .(2)529 1005 291     .(5)177 7493 378
 94.5     .(2)526 3157 895     .(5)174 9574 635
 95.0     .(2)523 5602 094     .(5)172 2237 531
 95.5     .(2)520 8333 333     .(5)169 5466 999
 96.0     .(2)518 1347 150     .(5)166 9248 438
 96.5     .(2)515 4639 175     .(5)164 3567 692
 97.0     .(2)512 8205 128     .(5)161 8411 044
 97.5     .(2)510 2040 816     .(5)159 3765 191
 98.0     .(2)507 6142 132     .(5)156 9617 233
 98.5     .(2)505 0505 051     .(5)154 5954 662
 99.0     .(2)502 5125 628     .(5)152 2765 342
 99.5     .(2)500 0000 000     .(5)150 0037 501
100.0     .(2)497 5124 378     .(5)147 7759 716




TABLE II

(The numbers in parentheses denote the number of ciphers between the
decimal point and the first significant figure.)

   p           A                     B                      C

  1.0    1.000 0000 000      -1.000 0000 000        1.500 0000 000
  1.5     .640 6250 000       -.312 5000 000         .250 0000 000
  2.0     .485 7142 857       -.142 8571 429         .(1)714 2857 143
  2.5     .394 5312 500       -.(1)781 2500 000      .(1)267 8571 429
  3.0     .333 3333 333       -.(1)476 1904 762      .(1)119 0476 190
  3.5     .289 0625 000       -.(1)312 5000 000      .(2)595 2380 952
  4.0     .255 4112 554       -.(1)216 4502 165      .(2)324 6753 247
  4.5     .228 9062 500       -.(1)156 2500 000      .(2)189 3939 394
  5.0     .207 4592 075       -.(1)116 5501 166      .(2)116 5501 166
  5.5     .189 7321 429       -.(2)892 8571 429      .(3)749 2507 493
  6.0     .174 8251 748       -.(2)699 3006 993      .(3)499 5004 995
  6.5     .162 1093 750       -.(2)558 0357 143      .(3)343 4065 934
  7.0     .151 1312 217       -.(2)452 4886 878      .(3)242 4046 542
  7.5     .141 5550 595       -.(2)372 0238 095      .(3)175 0700 280
  8.0     .133 1269 350       -.(2)309 5975 232      .(3)128 9989 680
  8.5     .125 6510 417       -.(2)260 4166 667      .(4)967 4922 601
  9.0     .118 9739 054       -.(2)221 1410 880      .(4)737 1369 601
  9.5     .112 9734 848       -.(2)189 3939 394      .(4)569 6058 328
 10.0     .107 5514 874       -.(2)163 4521 085      .(4)445 7784 778
 10.5     .102 6278 409       -.(2)142 0454 545      .(4)352 9079 616
 11.0     .(1)981 3664 596    -.(2)124 2236 025      .(4)282 3263 693
 11.5     .(1)940 2316 434    -.(2)109 2657 343      .(4)228 0328 367
 12.0     .(1)902 4154 589    -.(3)966 1835 749      .(4)185 8045 336
 12.5     .(1)867 5309 066    -.(3)858 5164 835      .(4)152 6251 526
 13.0     .(1)835 2490 421    -.(3)766 2835 249      .(4)126 3104 711
 13.5     .(1)805 2884 615    -.(3)686 8131 868      .(4)105 2587 260
 14.0     .(1)777 4069 954    -.(3)617 9705 846      .(5)882 8151 209
 14.5     .(1)751 3950 893    -.(3)558 0357 143      .(5)744 8752 582
 15.0     .(1)727 0704 824    -.(3)505 6122 965      .(5)632 0153 706
 15.5     .(1)704 2738 971    -.(3)459 5588 235      .(5)539 0719 338
 16.0     .(1)682 8655 216    -.(3)418 9359 028      .(5)462 0616 575
 16.5     .(1)662 7221 201    -.(3)382 9656 863      .(5)397 8864 273
 17.0     .(1)643 7346 437    -.(3)351 0003 510      .(5)344 1179 912
 17.5     .(1)625 8062 436    -.(3)322 4974 200      .(5)298 8393 081
 18.0     .(1)608 8506 089    -.(3)297 0002 970      .(5)260 5265 763
 18.5     .(1)592 7905 702    -.(3)274 1228 070      .(5)227 9607 543
 19.0     .(1)577 5569 190    -.(3)253 5368 389      .(5)200 1606 623
 19.5     .(1)563 0874 060    -.(3)234 9624 060      .(5)176 3320 120
 20.0     .(1)549 3258 868    -.(3)218 1596 056      .(5)155 8282 897
 20.5     .(1)536 2215 909    -.(3)202 9220 779      .(5)138 1205 295
 21.0     .(1)523 7284 931    -.(3)189 0716 582      .(5)122 7738 040
 21.5     .(1)511 8047 713    -.(3)176 4539 808      .(5)109 4288 253
 22.0     .(1)500 4123 371    -.(3)164 9348 507      .(6)977 8746 091
 22.5     .(1)489 5164 279    -.(3)154 3972 332      .(6)876 0126 706
 23.0     .(1)479 0852 511    -.(3)144 7387 466      .(6)785 2017 048
 23.5     .(1)469 0896 739    -.(3)135 8695 652      .(6)707 9612 604
 24.0     .(1)459 5029 501    -.(3)127 7106 587      .(6)638 5532 937
 24.5     .(1)450 3004 808    -.(3)120 1923 077      .(6)577 1539 385
 25.0     .(1)441 4596 027    -.(3)113 2528 483      .(6)522 7054 537





TABLE II — (Continued)


yO 


23 

C 

 50.5    .(1)220 6235 860    -.(4)141 4027 149    .(7)163 1099 278
 51.0    .(1)218 4809 327    -.(4)137 3230 250    .(7)155 3427 884
 51.5    .(1)216 3795 035    -.(4)133 3987 877    .(7)148 0152 984
 52.0    .(1)214 3181 200    -.(4)129 6226 684    .(7)141 0986 957
 52.5    .(1)212 2956 479    -.(4)125 9877 439    .(7)134 5663 486
 53.0    .(1)210 3109 957    -.(4)122 4874 757    .(7)128 3935 804
 53.5    .(1)208 3631 123    -.(4)119 1156 852    .(7)122 5575 085
 54.0    .(1)206 4509 850    -.(4)115 8665 310    .(7)117 0369 000
 54.5    .(1)204 5736 382    -.(4)112 7344 877    .(7)111 8120 384
 55.0    .(1)202 7301 313    -.(4)109 7143 258    .(7)106 8646 031
 55.5    .(1)200 9195 574    -.(4)106 8010 936    .(7)102 1775 591
 56.0    .(1)199 1410 418    -.(4)103 9901 001    .(8)977 3505 652
 56.5    .(1)197 3937 403    -.(4)101 2768 991    .(8)935 2233 857
 57.0    .(1)195 6768 382    -.(5)986 5727 449    .(8)895 2565 744
 57.5    .(1)193 9895 490    -.(5)961 2722 631    .(8)857 3219 737
 58.0    .(1)192 3311 130    -.(5)936 8295 813    .(8)821 3000 421
 58.5    .(1)190 7007 963    -.(5)913 2086 499    .(8)787 0792 070
 59.0    .(1)189 0978 896    -.(5)890 3752 219    .(8)754 5552 728
 59.5    .(1)187 5217 074    -.(5)868 2967 491    .(8)723 6308 764
 60.0    .(1)185 9715 868    -.(5)846 9422 843    .(8)694 2149 871
 60.5    .(1)184 4468 866    -.(5)826 2823 903    .(8)666 2224 473
 61.0    .(1)182 9469 865    -.(5)806 2890 546    .(8)639 5735 494
 61.5    .(1)181 4712 863    -.(5)786 9356 098    .(8)614 1936 467
 62.0    .(1)180 0192 049    -.(5)768 1966 583    .(8)590 0127 944
 62.5    .(1)178 5901 798    -.(5)750 0480 031    .(8)566 9654 196
 63.0    .(1)177 1836 660    -.(5)732 4665 812    .(8)544 9900 158
 63.5    .(1)175 7991 358    -.(5)715 4304 029    .(8)524 0288 613
 64.0    .(1)174 4360 776    -.(5)698 9184 935    .(8)504 0277 597
 64.5    .(1)173 0939 958    -.(5)682 9108 392    .(8)484 9357 992
 65.0    .(1)171 7724 099    -.(5)667 3883 359    .(8)466 7051 300
 65.5    .(1)170 4708 538    -.(5)652 3327 419    .(8)449 2907 595
 66.0    .(1)169 1888 755    -.(5)637 7266 321    .(8)432 6503 610
 66.5    .(1)167 9260 366    -.(5)623 5533 562    .(8)416 7440 977
 67.0    .(1)166 6819 116    -.(5)609 7969 986    .(8)401 5344 591
 67.5    .(1)165 4560 875    -.(5)596 4423 407    .(8)386 9861 091
 68.0    .(1)164 2481 635    -.(5)583 4748 260    .(8)373 0657 455
 68.5    .(1)163 0577 503    -.(5)570 8805 261    .(8)359 7419 689
 69.0    .(1)161 8844 697    -.(5)558 6461 100    .(8)346 9851 615
 69.5    .(1)160 7279 547    -.(5)546 7588 138    .(8)334 7673 741
 70.0    .(1)159 5878 482    -.(5)535 2064 131    .(8)323 0622 212
 70.5    .(1)158 4638 037    -.(5)523 9771 965    .(8)311 8447 829
 71.0    .(1)157 3554 838    -.(5)513 0599 408    .(8)301 0915 145
 71.5    .(1)156 2625 611    -.(5)502 4438 871    .(8)290 7801 613
 72.0    .(1)155 1847 168    -.(5)492 1187 187    .(8)280 8896 796
 72.5    .(1)154 1216 409    -.(5)482 0745 403    .(8)271 4001 634
 73.0    .(1)153 0730 320    -.(5)472 3018 576    .(8)262 2927 754
 73.5    .(1)152 0385 968    -.(5)462 7915 587    .(8)253 5496 829
 74.0    .(1)151 0180 498    -.(5)453 5348 963    .(8)245 1539 980
 74.5    .(1)150 0111 131    -.(5)444 5234 708    .(8)237 0897 218
 75.0    .(1)149 0175 162    -.(5)435 7492 141    .(8)229 3416 916








TABLE II — (Continued) 

I ~3 


(T 


 75.5    .(1)148 0369 959    -.(5)427 2043 746    .(8)221 8955 328
 76.0    .(1)147 0692 956    -.(5)418 8815 026    .(8)214 7376 124
 76.5    .(1)146 1141 654    -.(5)410 7734 371    .(8)207 8549 966
 77.0    .(1)145 1713 622    -.(5)402 8732 923    .(8)201 2354 107
 77.5    .(1)144 2406 486    -.(5)395 1744 458    .(8)194 8672 015
 78.0    .(1)143 3217 937    -.(5)387 6705 266    .(8)188 7393 021
 78.5    .(1)142 4145 722    -.(5)380 3554 041    .(8)182 8411 989
 79.0    .(1)141 5187 645    -.(5)373 2231 778    .(8)177 1629 008
 79.5    .(1)140 6341 567    -.(5)366 2681 669    .(8)171 6949 101
 80.0    .(1)139 7605 399    -.(5)359 4849 013    .(8)166 4281 950

 80.5    .(1)138 8977 106    -.(5)352 8681 120    .(8)161 3541 647
 81.0    .(1)138 0454 701    -.(5)346 4127 230    .(8)156 4646 446
 81.5    .(1)137 2036 248    -.(5)340 1138 429    .(8)151 7518 541
 82.0    .(1)136 3719 855    -.(5)333 9667 569    .(8)147 2083 854
 82.5    .(1)135 5503 678    -.(5)327 9669 199    .(8)142 8271 834
 83.0    .(1)134 7385 917    -.(5)322 1099 490    .(8)138 6015 271
 83.5    .(1)133 9364 812    -.(5)316 3916 169    .(8)134 5250 116
 84.0    .(1)133 1438 649    -.(5)310 8078 455    .(8)130 5915 317
 84.5    .(1)132 3605 750    -.(5)305 3547 000    .(8)126 7952 663
 85.0    .(1)131 5864 481    -.(5)300 0283 827    .(8)123 1306 632

 85.5    .(1)130 8213 241    -.(5)294 8252 276    .(8)119 5924 258
 86.0    .(1)130 0650 470    -.(5)289 7416 953    .(8)116 1754 993
 86.5    .(1)129 3174 642    -.(5)284 7743 676    .(8)112 8750 590
 87.0    .(1)128 5784 266    -.(5)279 9199 429    .(8)109 6864 980
 87.5    .(1)127 8477 885    -.(5)275 1752 316    .(8)106 6054 166
 88.0    .(1)127 1254 075    -.(5)270 5371 515    .(8)103 6276 117
 88.5    .(1)126 4111 445    -.(5)266 0027 239    .(8)100 7490 669
 89.0    .(1)125 7048 632    -.(5)261 5690 691    .(9)979 6594 350
 89.5    .(1)125 0064 308    -.(5)257 2334 033    .(9)952 7457 143
 90.0    .(1)124 3157 171    -.(5)252 9930 341    .(9)926 7144 106

 90.5    .(1)123 6325 948    -.(5)248 8453 575    .(9)901 5319 538
 91.0    .(1)122 9569 394    -.(5)244 7878 546    .(9)877 1662 253
 91.5    .(1)122 2886 291    -.(5)240 8180 879    .(9)853 5864 881
 92.0    .(1)121 6275 450    -.(5)236 9336 988    .(9)830 7633 199
 92.5    .(1)120 9735 702    -.(5)233 1324 043    .(9)808 6685 508
 93.0    .(1)120 3265 909    -.(5)229 4119 941    .(9)787 2752 029
 93.5    .(1)119 6864 953    -.(5)225 7703 284    .(9)766 5574 344
 94.0    .(1)119 0531 742    -.(5)222 2053 346    .(9)746 4904 858
 94.5    .(1)118 4265 205    -.(5)218 7150 056    .(9)727 0506 294
 95.0    .(1)117 8064 296    -.(5)215 2973 968    .(9)708 2151 209

 95.5    .(1)117 1927 988    -.(5)211 9506 240    .(9)689 9621 539
 96.0    .(1)116 5855 277    -.(5)208 6728 615    .(9)672 2708 166
 96.5    .(1)115 9845 180    -.(5)205 4623 396    .(9)655 1210 509
 97.0    .(1)115 3896 733    -.(5)202 3173 428    .(9)638 4936 130
 97.5    .(1)114 8008 993    -.(5)199 2362 081    .(9)622 3700 369
 98.0    .(1)114 2181 034    -.(5)196 2173 225    .(9)606 7325 988
 98.5    .(1)113 6411 951    -.(5)193 2591 218    .(9)591 5642 838
 99.0    .(1)113 0700 856    -.(5)190 3600 889    .(9)576 8487 544
 99.5    .(1)112 5046 880    -.(5)187 5187 519    .(9)562 5703 199
100.0    .(1)111 9449 168    -.(5)184 7336 824    .(9)548 7139 081








TABLE III 

(The numbers in parentheses denote the number of ciphers between the
decimal point and the first significant figure.)


  1.5    2.534 7222 222      -1.138 8888 889      .555 5555 556
  2.0    .902 7777 778       -.236 1111 111       .(1)694 4444 444
  2.5    .450 6999 559       -.(1)779 3209 877    .(1)154 3209 877
  3.0    .262 5661 376       -.(1)324 0740 741    .(2)462 9629 630
  3.5    .167 8541 366       -.(1)155 7239 057    .(2)168 3501 684
  4.0    .114 3378 227       -.(2)827 7216 611    .(3)701 4590 348
  4.5    .(1)816 0531 598    -.(2)474 2942 243    .(3)323 7503 238
  5.0    .(1)603 7943 538    -.(2)288 1377 881    .(3)161 8751 619

  5.5    .(1)459 7794 181    -.(2)183 4585 167    .(4)863 3341 967
  6.0    .(1)358 4609 835    -.(2)121 4063 714    .(4)485 6254 856
  6.5    .(1)285 0269 624    -.(3)829 8482 563    .(4)285 6620 504
  7.0    .(1)230 4589 927    -.(3)583 0679 850    .(4)174 5712 530
  7.5    .(1)189 0399 941    -.(3)419 5222 848    .(4)110 2555 282
  8.0    .(1)157 0204 105    -.(3)308 1642 014    .(5)716 6609 334
  8.5    .(1)131 8685 978    -.(3)230 5259 336    .(5)477 7739 556
  9.0    .(1)111 8316 811    -.(3)175 2561 737    .(5)325 7549 697
  9.5    .(2)956 6897 397    -.(3)135 1741 492    .(5)226 6121 528
 10.0    .(2)824 8507 009    -.(3)105 6201 476    .(5)160 5169 416

 10.5    .(2)716 2246 443    -.(4)835 0091 302    .(5)115 5721 980
 11.0    .(2)625 9079 085    -.(4)667 2071 889    .(6)844 5660 620
 11.5    .(2)550 1917 791    -.(4)538 3326 639    .(6)625 6044 903
 12.0    .(2)486 2354 500    -.(4)438 2359 455    .(6)469 2033 678
 12.5    .(2)431 8375 890    -.(4)359 6848 299    .(6)355 9473 824
 13.0    .(2)385 2742 263    -.(4)297 4533 626    .(6)272 8929 932
 13.5    .(2)345 1820 327    -.(4)247 7164 138    .(6)211 2719 947
 14.0    .(2)310 4731 565    -.(4)207 6407 573    .(6)165 0562 459
 14.5    .(2)280 2723 203    -.(4)175 1046 701    .(6)130 0443 149
 15.0    .(2)253 8698 342    -.(4)148 5029 580    .(6)103 2704 854

 15.5    .(2)230 6861 266    -.(4)126 6096 151    .(7)826 1638 832
 16.0    .(2)210 2447 093    -.(4)108 4799 077    .(7)665 5209 059
 16.5    .(2)192 1513 833    -.(5)933 7977 791    .(7)539 6115 453
 17.0    .(2)176 0781 076    -.(5)807 3440 736    .(7)440 2094 185
 17.5    .(2)161 7503 875    -.(5)700 9036 937    .(7)361 1974 716
 18.0    .(2)148 9373 393    -.(5)610 8752 239    .(7)297 9879 141
 18.5    .(2)137 4438 092    -.(5)534 3795 459    .(7)247 1119 288
 19.0    .(2)127 1040 798    -.(5)469 1008 114    .(7)205 9266 073
 19.5    .(2)117 7768 137    -.(5)413 1653 981    .(7)172 4036 712
 20.0    .(2)109 3409 664    -.(5)365 0491 008    .(7)144 9758 144

 20.5    .(2)101 6924 641    -.(5)323 5054 757    .(7)122 4240 211
 21.0    .(3)947 4149 003    -.(5)287 5101 521    .(7)103 7942 787
 21.5    .(3)884 1025 513    -.(5)256 2172 813    .(8)883 3555 638
 22.0    .(3)826 3115 900    -.(5)228 9252 750    .(8)754 5328 774
 22.5    .(3)773 4526 546    -.(5)205 0496 990    .(8)646 7424 663
 23.0    .(3)725 0103 342    -.(5)184 1017 105    .(8)556 1985 210
 23.5    .(3)680 5325 601    -.(5)165 6708 183    .(8)479 8575 476
 24.0    .(3)639 6217 018    -.(5)149 4110 299    .(8)415 2613 392
 24.5    .(3)601 9270 658    -.(5)135 0296 678    .(8)360 4155 020
 25.0    .(3)567 1385 543    -.(5)122 2783 008    .(8)313 6949 739




TABLE III — (Continued)




S' 


C' 


 25.5    .(3)534 9812 832    -.(5)110 9453 570    .(8)273 7701 591
 26.0    .(3)505 2110 028    -.(5)100 8500 824    .(8)239 5488 892
 26.5    .(3)477 6101 850    -.(6)918 3758 072    .(8)210 1306 046
 27.0    .(3)451 9846 731    -.(6)837 7472 451    .(8)184 7700 144
 27.5    .(3)428 1608 021    -.(6)765 4677 208    .(8)162 8481 482
 28.0    .(3)405 9829 189    -.(6)700 5455 924    .(8)143 8491 976
 28.5    .(3)385 3112 389    -.(6)642 1215 945    .(8)127 3419 126
 29.0    .(3)366 0199 892    -.(6)589 4492 824    .(8)112 9645 999
 29.5    .(3)347 9957 969    -.(6)541 8786 342    .(8)100 4129 777
 30.0    .(3)331 1362 854    -.(6)498 8422 596    .(9)894 3030 827

 30.5    .(3)315 3488 491    -.(6)459 8437 659    .(9)797 9935 200
 31.0    .(3)300 5495 828    -.(6)424 4479 170    .(9)713 3578 436
 31.5    .(3)286 6623 418    -.(6)392 2722 841    .(9)638 8279 197
 32.0    .(3)273 6179 176    -.(6)362 9801 450    .(9)573 0662 221
 32.5    .(3)261 3533 111    -.(6)336 2744 286    .(9)514 9290 691
 33.0    .(3)249 8110 923    -.(6)311 8925 371    .(9)463 4361 622
 33.5    .(3)238 9388 342    -.(6)289 6019 105    .(9)417 7452 730
 34.0    .(3)228 6886 114    -.(6)269 1962 143    .(9)377 1311 492
 34.5    .(3)219 0165 553    -.(6)250 4920 591    .(9)340 9678 883
 35.0    .(3)209 8824 591    -.(6)233 3261 690    .(9)308 7141 691

 35.5    .(3)201 2494 260    -.(6)217 5529 331    .(9)279 9008 467
 36.0    .(3)193 0835 554    -.(6)203 0422 839    .(9)254 1205 055
 36.5    .(3)185 3536 630    -.(6)189 6778 555    .(9)231 0186 414
 37.0    .(3)178 0310 300    -.(6)177 3553 804    .(9)210 2861 992
 37.5    .(3)171 0891 785    -.(6)165 9829 799    .(9)191 6532 449
 38.0    .(3)164 5036 705    -.(6)155 4715 079    .(9)174 8835 859
 38.5    .(3)158 2519 263    -.(6)145 7503 555    .(9)159 7701 896
 39.0    .(3)152 3130 623    -.(6)136 7496 434    .(9)146 1312 710
 39.5    .(3)146 6677 431    -.(6)128 4078 366    .(9)133 8069 469
 40.0    .(3)141 2980 508    -.(6)120 6693 349    .(9)122 6563 680

 40.5    .(3)136 1873 638    -.(6)113 4838 362    .(9)112 5552 554
 41.0    .(3)131 3202 493    -.(6)106 8057 759    .(9)103 3937 811
 41.5    .(3)126 6823 653    -.(6)100 5938 300    .(10)950 7474 123
 42.0    .(3)122 2603 712    -.(7)948 1047 667    .(10)875 1197 773
 42.5    .(3)118 0418 479    -.(7)894 2160 708    .(10)806 2901 319
 43.0    .(3)114 0152 237    -.(7)843 9617 986    .(10)743 5786 772
 43.5    .(3)110 1697 079    -.(7)797 0591 436    .(10)686 3803 174
 44.0    .(3)106 4952 300    -.(7)753 2501 737    .(10)634 1557 280
 44.5    .(3)102 9823 841    -.(7)712 2993 971    .(10)586 4235 765
 45.0    .(4)996 2237 840    -.(7)673 9915 889    .(10)542 7537 357

 45.5    .(4)964 0698 884    -.(7)638 1298 500    .(10)502 7613 551
 46.0    .(4)933 2851 680    -.(7)604 5338 699    .(10)466 1016 730
 46.5    .(4)903 7975 033    -.(7)573 0383 707    .(10)432 4654 698
 47.0    .(4)875 5392 866    -.(7)543 4917 120    .(10)401 5750 791
 47.5    .(4)848 4470 959    -.(7)515 7546 374    .(10)373 1808 816
 48.0    .(4)822 4613 955    -.(7)489 6991 482    .(10)347 0582 199
 48.5    .(4)797 5262 609    -.(7)465 2074 902    .(10)323 0046 799
 49.0    .(4)773 5891 262    -.(7)442 1712 397    .(10)300 8376 920
 49.5    .(4)750 6005 505    -.(7)420 4904 806    .(10)280 3924 120
 50.0    .(4)728 5140 038    -.(7)400 0730 601    .(10)261 5198 458
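
The entries of these tables are printed in a compact notation: a figure in parentheses after the decimal point counts the ciphers (zeros) that stand between the decimal point and the first significant figure, so that -.(2)221 1410 880 is to be read as -0.002211410880. The following Python sketch shows one way such an entry might be expanded into an ordinary decimal. The function name parse_entry and the regular expression are illustrative assumptions and form no part of the original paper.

```python
import re
from decimal import Decimal

# Pattern for one table entry: optional sign, optional integer part,
# a decimal point, an optional "(n)" cipher count, then grouped digits.
# This pattern is an assumption about the printed layout, not a
# specification taken from the paper.
_ENTRY = re.compile(r"^\s*(-?)\s*(\d*)\s*\.\s*(?:\((\d+)\))?\s*([\d ]+?)\s*$")


def parse_entry(text):
    """Expand an entry such as '-.(2)221 1410 880' into a Decimal."""
    match = _ENTRY.match(text)
    if match is None:
        raise ValueError(f"unrecognized table entry: {text!r}")
    sign, integer, ciphers, digits = match.groups()
    zeros = "0" * int(ciphers or 0)        # ciphers after the decimal point
    digits = digits.replace(" ", "")       # drop the printed digit grouping
    value = Decimal(f"{integer or '0'}.{zeros}{digits}")
    return -value if sign else value


if __name__ == "__main__":
    for entry in ("2.534 7222 222", ".118 9739 054", "-.(2)221 1410 880"):
        print(entry, "->", parse_entry(entry))
```

Decimal is used in place of a binary float so that all ten printed significant figures are carried without rounding.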







TABLE IV 


(The numbers in parentheses denote the number of ciphers between the
decimal point and the first significant figure.)



Ai' 

S 

c 

  2.0    1.000 0000 000      -1.250 0000 000      .250 0000 000
  2.5    .705 9936 523       -.495 6054 688       .(1)615 2343 750
  3.0    .567 0995 671       -.265 1515 152       .(1)227 2727 273
  3.5    .479 4006 348       -.162 3535 156       .(1)102 5390 625
  4.0    .417 2494 172       -.107 8088 578       .(2)524 4755 245
  4.5    .370 3002 930       -.(1)756 8359 375    .(2)292 9687 500
  5.0    .333 3333 333       -.(1)553 6130 536    .(2)174 8251 748
  5.5    .303 3523 560       -.(1)418 0908 203    .(2)109 8632 813
  6.0    .278 4862 197       -.(1)323 9407 651    .(3)719 8683 669
  6.5    .257 4942 453       -.(1)256 3476 563    .(3)488 2812 500
  7.0    .239 5159 021       -.(1)206 4885 579    .(3)340 9902 791
  7.5    .223 9329 020       -.(1)168 8639 323    .(3)244 1406 250
  8.0    .210 2881 638       -.(1)139 9142 653    .(3)178 6139 557
  8.5    .198 2357 141       -.(1)117 2614 820    .(3)133 1676 136
  9.0    .187 5084 130       -.(2)992 7311 886    .(3)100 9557 141
  9.5    .177 8964 418       -.(2)848 0187 618    .(4)776 8110 795
 10.0    .169 2325 443       -.(2)730 2463 319    .(4)605 7342 846
 10.5    .161 3816 481       -.(2)633 3998 033    .(4)478 0375 874
 11.0    .154 2334 096       -.(2)553 0129 672    .(4)381 3882 532
 11.5    .147 6967 551       -.(2)485 7203 344    .(4)307 3098 776
 12.0    .141 6958 188       -.(2)428 9521 906    .(4)249 8750 625
 12.5    .136 1668 726       -.(2)380 7227 928    .(4)204 8732 517
 13.0    .131 0559 774       -.(2)339 4807 972    .(4)169 2702 036
 13.5    .126 3171 605       -.(2)304 0020 282    .(4)140 8503 606
 14.0    .121 9109 898       -.(2)273 3115 355    .(4)117 9762 025
 14.5    .117 8034 442       -.(2)246 6262 196    .(5)994 2378 394
 15.0    .113 9650 116       -.(2)223 3120 976    .(5)842 6871 608
 15.5    .110 3699 628       -.(2)202 8521 369    .(5)718 0606 618
 16.0    .106 9957 611       -.(2)184 8217 922    .(5)614 9338 741
 16.5    .103 8225 802       -.(2)168 8702 311    .(5)529 0973 297
 17.0    .100 8329 073       -.(2)154 7057 999    .(5)457 2585 218
 17.5    .(1)980 1121 368    -.(2)142 0846 788    .(5)396 8229 973
 18.0    .(1)953 4368 071    -.(2)130 8019 601    .(5)345 7320 530
 18.5    .(1)928 1796 986    -.(2)120 6845 814    .(5)302 3413 313
 19.0    .(1)904 2302 916    -.(2)111 5856 901    .(5)265 3292 500
 19.5    .(1)881 4892 929    -.(2)103 3801 211    .(5)233 6273 923
 20.0    .(1)859 8672 410    -.(3)959 6074 542    .(5)206 3671 945
 20.5    .(1)839 2833 163    -.(3)892 3550 616    .(5)182 8388 288
 21.0    .(1)819 6643 189    -.(3)831 2499 865    .(5)162 4592 807
 21.5    .(1)800 9437 893    -.(3)775 6048 512    .(5)144 7474 061
 22.0    .(1)783 0612 483    -.(3)724 8225 801    .(5)129 3043 255
 22.5    .(1)765 9615 377    -.(3)678 3828 434    .(5)115 7979 249
 23.0    .(1)749 5942 464    -.(3)635 8307 796    .(5)103 9505 362
 23.5    .(1)733 9132 094    -.(3)596 7675 752    .(6)935 2909 319
 24.0    .(1)718 8760 691    -.(3)560 8425 626    .(6)843 3722 746
 24.5    .(1)704 4438 900    -.(3)527 7465 684    .(6)762 0889 075
 25.0    .(1)690 5808 198    -.(3)497 2062 910    .(6)690 0318 611







TABLE IV — (Continued)




  2.0    -.538 1944 444       .121 5277 778
  2.5    -.(1)824 6527 778    .(1)121 5277 778
  3.0    -.(1)211 4898 990    .(2)220 9595 960
  3.5    -.(2)706 2815 657    .(3)552 3989 899
  4.0    -.(2)279 2346 542    .(3)169 9689 200
  4.5    -.(2)124 4415 307    .(4)607 0318 570
  5.0    -.(3)607 0318 570    .(4)242 8127 428

  5.5    -.(3)317 9329 351    .(4)106 2305 750
  6.0    -.(3)176 3963 161    .(5)499 9085 881
  6.5    -.(3)102 6597 994    .(5)249 9542 941
  7.0    -.(4)622 0667 018    .(5)131 5548 916
  7.5    -.(4)390 2012 053    .(6)723 5519 039
  8.0    -.(4)252 2095 208    .(6)413 4582 308
  8.5    -.(4)167 3566 157    .(6)244 3162 273
  9.0    -.(4)113 6601 579    .(6)148 7142 253
  9.5    -.(5)788 0526 136    .(7)929 4639 082
 10.0    -.(5)556 6161 004    .(7)594 8569 012

 10.5    -.(5)399 7797 905    .(7)388 9448 970
 11.0    -.(5)291 5234 609    .(7)259 2965 980
 11.5    -.(5)215 5402 971    .(7)175 9512 629
 12.0    -.(5)161 3897 791    .(7)121 3456 986
 12.5    -.(5)122 2557 913    .(8)849 4198 899
 13.0    -.(6)936 0842 049    .(8)602 8141 154
 13.5    -.(6)723 8747 984    .(8)433 2726 455
 14.0    -.(6)564 9425 143    .(8)315 1073 785
 14.5    -.(6)444 6919 780    .(8)231 6966 018
 15.0    -.(6)352 8408 251    .(8)172 1174 757

 15.5    -.(6)282 0575 132    .(8)129 0881 067
 16.0    -.(6)227 0555 132    .(9)976 8829 690
 16.5    -.(6)183 9826 864    .(9)745 5159 507
 17.0    -.(6)150 0043 633    .(9)573 4738 083
 17.5    -.(6)123 0152 522    .(9)444 4422 014
 18.0    -.(6)101 4381 253    .(9)346 8817 182
 18.5    -.(7)840 8165 075    .(9)272 5499 214
 19.0    -.(7)700 3899 143    .(9)215 5045 890
 19.5    -.(7)586 1479 930    .(9)171 4241 049
 20.0    -.(7)492 7218 558    .(9)137 1392 839

 20.5    -.(7)415 9387 633    .(9)110 3076 849
 21.0    -.(7)352 5353 142    .(10)891 8493 673
 21.5    -.(7)299 9440 718    .(10)724 6276 109
 22.0    -.(7)256 1336 780    .(10)591 5327 436
 22.5    -.(7)219 4882 245    .(10)485 0568 497
 23.0    -.(7)188 7156 473    .(10)399 4585 821
 23.5    -.(7)162 7777 261    .(10)330 3215 199
 24.0    -.(7)140 8362 750    .(10)274 2291 863
 24.5    -.(7)122 2115 427    .(10)228 5243 219
 25.0    -.(7)106 3498 773    .(10)191 1294 329







TABLE V 


(The numbers in parentheses denote the number of ciphers between the
decimal point and the first significant figure.)


  2.5    2.755 1050 816      -1.695 6105 194      .200 8159 722
  3.0    1.170 5555 556      -.456 9444 444       .(1)363 8888 889
  3.5    .658 2671 3-7       -.181 6553 455       .(1)104 8944 979
  4.0    .418 6208 236       -.(1)868 9782 440    .(2)382 4786 325
  4.5    .286 4246 524       -.(1)466 0481 012    .(2)162 0459 402
  5.0    .206 0275 835       -.(1)270 7119 270    .(3)763 8888 889
  5.5    .153 7850 411       -.(1)166 9293 642    .(3)390 4384 270
  6.0    .118 1431 967       -.(1)107 8702 752    .(3)212 7325 289
  6.5    .(1)928 9213 334    -.(2)724 0120 171    .(3)122 1004 174
  7.0    .(1)744 5226 377    -.(2)501 4851 420    .(4)731 8541 452
  7.5    .(1)606 4555 217    -.(2)356 7180 657    .(4)455 0831 382
  8.0    .(1)500 8805 006    -.(2)259 6021 548    .(4)292 0668 942
  8.5    .(1)418 6842 624    -.(2)192 7113 758    .(4)192 6724 344
  9.0    .(1)353 6828 101    -.(2)145 5695 568    .(4)130 2141 757
  9.5    .(1)301 5707 729    -.(2)111 6692 311    .(5)899 1006 061
 10.0    .(1)259 2832 573    -.(3)868 5125 822    .(5)632 8140 010
 10.5    .(1)224 5954 598    -.(3)683 8979 612    .(5)453 1298 476
 11.0    .(1)195 8642 756    -.(3)544 5806 418    .(5)329 5585 675
 11.5    .(1)171 8573 720    -.(3)438 0711 370    .(5)243 1030 464
 12.0    .(1)151 6376 163    -.(3)355 6755 034    .(5)181 6613 061
 12.5    .(1)134 4833 894    -.(3)291 2423 953    .(5)137 3671 277
 13.0    .(1)119 8326 681    -.(3)240 3543 455    .(5)105 0128 024
 13.5    .(1)107 2431 572    -.(3)199 7956 319    .(6)810 9219 640
 14.0    .(2)963 6345 096    -.(3)167 1959 380    .(6)632 0799 809
 14.5    .(2)869 1189 432    -.(3)140 7878 615    .(6)496 9749 385
 15.0    .(2)786 6089 931    -.(3)119 2394 389    .(6)393 9212 946
 15.5    .(2)714 2517 761    -.(3)101 5368 865    .(6)314 6050 442
 16.0    .(2)650 5281 970    -.(4)869 0145 535    .(6)253 0429 506
 16.5    .(2)594 1846 661    -.(4)747 2977 719    .(6)204 8829 224
 17.0    .(2)544 1803 654    -.(4)645 5059 265    .(6)166 9275 426
 17.5    .(2)499 6461 807    -.(4)559 9303 930    .(6)136 8055 923
 18.0    .(2)459 8524 699    -.(4)487 6318 743    .(6)112 7430 005
 18.5    .(2)424 1835 784    -.(4)426 2652 106    .(7)934 0140 025
 19.0    .(2)392 1175 513    -.(4)373 9474 673    .(7)777 6318 298
 19.5    .(2)363 2098 774    -.(4)329 1578 113    .(7)650 4887 492
 20.0    .(2)337 0803 860    -.(4)290 6609 405    .(7)546 5721 085
 20.5    .(2)313 4026 161    -.(4)257 4481 065    .(7)461 2129 835
 21.0    .(2)291 8951 572    -.(4)228 6913 803    .(7)390 7629 285
 21.5    .(2)272 2750 483    -.(4)203 7079 595    .(7)332 3537 754
 22.0    .(2)254 4494 241    -.(4)181 9321 409    .(7)283 7178 436
 22.5    .(2)238 1156 895    -.(4)162 8931 866    .(7)243 0524 324
 23.0    .(2)223 1524 850    -.(4)146 1977 446    .(7)208 9170 140
 23.5    .(2)209 4188 368    -.(4)131 5158 166    .(7)180 1547 432
 24.0    .(2)196 7908 418    -.(4)118 5694 971    .(7)155 8321 715
 24.5    .(2)186 2576 776    -.(4)107 7593 614    .(7)135 9946 487
 25.0    .(2)174 4277 221    -.(5)969 7978 581    .(7)117 6202 932








TABLE V — (Continued)


  2.5    1.151 0416 667      -.140 9722 222       .(1)175 0000 000
  3.0    .203 1250 000       -.(1)170 1388 889    .(2)145 8333 333
  3.5    .(1)579 2905 012    -.(2)355 2350 427    .(3)224 3589 744
  4.0    .(1)210 1544 289    -.(3)988 2478 632    .(4)480 7692 308
  4.5    .(2)887 9662 005    -.(3)331 1965 812    .(4)128 2051 282
  5.0    .(2)417 9414 336    -.(3)126 8696 582    .(5)400 6410 256
  5.5    .(2)213 4163 324    -.(4)538 1158 874    .(5)141 4027 149
  6.0    .(2)116 2108 929    -.(4)247 4547 511    .(6)549 8994 470
  6.5    .(3)666 7389 843    -.(4)121 5567 199    .(6)231 5366 092
  7.0    .(3)399 5246 884    -.(5)630 9372 602    .(6)104 1914 742
  7.5    .(3)248 3850 322    -.(5)343 1703 316    .(7)496 1498 770
  8.0    .(3)159 3881 480    -.(5)194 3253 685    .(7)248 0749 385
  8.5    .(3)105 1352 375    -.(5)113 9706 601    .(7)129 4304 027
  9.0    .(4)710 4822 114    -.(6)689 3966 587    .(8)701 0813 479
  9.5    .(4)490 5434 324    -.(6)428 5943 973    .(8)392 6055 548
 10.0    .(4)345 2433 425    -.(6)273 0621 968    .(8)226 5032 047
 10.5    .(4)247 2044 428    -.(6)177 8469 607    .(8)134 2241 213
 11.0    .(4)179 7851 483    -.(6)118 1651 639    .(9)814 9321 650
 11.5    .(4)132 6177 560    -.(7)799 4765 550    .(9)505 8199 645
 12.0    .(5)990 9817 818    -.(7)549 9387 059    .(9)320 3526 442
 12.5    .(5)749 3410 036    -.(7)384 0787 078    .(9)206 6791 253
 13.0    .(5)572 8402 600    -.(7)272 0198 696    .(9)135 6331 760
 13.5    .(5)442 3498 419    -.(7)195 1610 699    .(10)904 2211 731
 14.0    .(5)344 7903 612    -.(7)141 7056 417    .(10)611 6790 289
 14.5    .(5)271 0905 849    -.(7)104 0436 901    .(10)419 4370 484
 15.0    .(5)214 8754 286    -.(8)771 8806 793    .(10)291 2757 280
 15.5    .(5)171 6092 639    -.(8)578 2216 817    .(10)204 6802 413
 16.0    .(5)138 0280 549    -.(8)437 1000 417    .(10)145 4306 978
 16.5    .(5)111 7576 435    -.(8)333 2476 075    .(10)104 4117 830
 17.0    .(6)910 5379 546    -.(8)256 1134 028    .(11)756 9854 269
 17.5    .(6)746 2299 456    -.(8)198 3240 275    .(11)553 8917 758
 18.0    .(6)614 9748 289    -.(8)154 6720 805    .(11)408 8248 821
 18.5    .(6)509 4718 636    -.(8)121 4431 743    .(11)304 2417 727
 19.0    .(6)424 1700 765    -.(9)959 6292 582    .(11)228 1813 296
 19.5    .(6)354 8175 306    -.(9)762 8862 452    .(11)172 4036 712
 20.0    .(6)298 1344 148    -.(9)609 9716 846    .(11)131 1767 064
 20.5    .(6)251 5739 152    -.(9)490 3776 023    .(11)100 4757 751
 21.0    .(6)213 1458 799    -.(9)396 2862 254    .(12)774 5007 663
 21.5    .(6)181 2857 886    -.(9)321 8393 150    .(12)600 6332 473
 22.0    .(6)154 7566 781    -.(9)262 6168 770    .(12)468 4939 329
 22.5    .(6)132 5752 232    -.(9)215 2622 450    .(12)367 4462 219
 23.0    .(6)113 9556 568    -.(9)177 2103 622    .(12)289 7172 134
 23.5    .(7)982 6695 408    -.(9)146 4894 049    .(12)229 5872 257
 24.0    .(7)849 9994 800    -.(9)121 5749 393    .(12)182 8194 575
 24.5    .(7)741 7936 742    -.(9)101 8827 904    .(12)147 1231 630
 25.0    .(7)641 5689 752    -.(10)846 8458 443   .(12)117 5267 941







TABLE VI 


(The numbers in parentheses denote the number of ciphers 
between the decimal point and the first significant figure.) 


p 

A 


  3.0    1.000 0000 000      -1.361 1111 111
  3.5    .745 3327 179       -.617 5664 266
  4.0    .619 2696 193       -.362 6910 127
  4.5    .536 5078 449       -.238 2141 749
  5.0    .475 9358 289       -.167 1854 290
  5.5    .428 9313 952       -.122 7793 517
  6.0    .391 0671 373       -.(1)932 5031 693
  6.5    .359 7514 629       -.(1)727 0467 546
  7.0    .333 3333 333       -.(1)578 9958 809
  7.5    .310 6966 019       -.(1)469 2645 603
  8.0    .291 0527 351       -.(1)386 0218 617
  8.5    .273 8253 011       -.(1)321 6238 375
  9.0    .258 5812 357       -.(1)270 9606 497
  9.5    .244 9877 912       -.(1)230 5173 392
 10.0    .232 7844 334       -.(1)197 8161 991
 10.5    .221 7638 626       -.(1)171 0732 926
 11.0    .211 7588 265       -.(1)148 9801 195
 11.5    .202 6327 307       -.(1)130 5610 007
 12.0    .194 2728 127       -.(1)115 0776 715
 12.5    .186 5850 870       -.(1)101 9640 947
 13.0    .179 4905 415       -.(2)907 8107 302
 13.5    .172 9222 326       -.(2)811 8410 815
 14.0    .166 8230 401       -.(2)729 0028 821
 14.5    .161 1439 098       -.(2)657 1143 441
 15.0    .155 8424 640       -.(2)594 4165 421
 15.5    .150 8818 928       -.(2)539 4804 055
 16.0    .146 2300 606       -.(2)491 1364 663
 16.5    .141 8587 811       -.(2)448 4212 624
 17.0    .137 7432 239       -.(2)410 5360 674
 17.5    .133 8614 260       -.(2)376 8148 358
 18.0    .130 1938 866       -.(2)346 6991 057
 18.5    .126 7232 293       -.(2)319 7181 992
 19.0    .123 4339 187       -.(2)295 4734 917
 19.5    .120 3120 213       -.(2)273 6258 312
 20.0    .117 3450 034       -.(2)253 8854 120
 20.5    .114 5607 471       -.(2)236 0035 778
 21.0    .111 8314 598       -.(2)219 7661 473
 21.5    .109 2654 341       -.(2)204 9879 526
 22.0    .106 8150 524       -.(2)191 5083 451
 22.5    .104 4726 355       -.(2)179 1874 832
 23.0    .102 2311 723       -.(2)167 9032 479
 23.5    .100 0842 483       -.(2)157 5486 730
 24.0    .(1)980 2598 324    -.(2)148 0297 913
 24.5    .(1)960 5097 234    -.(2)139 2638 202
 25.0    .(1)941 5425 829    -.(2)131 1776 633







TABLE VI — (Continued)


/O 

(T 


  3.0    .388 8888 889       -.(1)277 7777 778
  3.5    .116 7093 913       -.(2)581 8684 896
  4.0    .(1)498 5754 986    -.(2)185 1851 852
  4.5    .(1)251 6301 473    -.(3)727 3356 120
  5.0    .(1)140 7742 584    -.(3)326 7973 856
  5.5    .(2)846 3824 237    -.(3)161 6301 360
  6.0    .(2)537 1649 335    -.(4)859 9931 201
  6.5    .(2)355 7417 128    -.(4)484 8904 080
  7.0    .(2)243 8852 284    -.(4)286 6643 734
  7.5    .(2)172 0852 322    -.(4)176 3237 847
  8.0    .(2)124 4257 604    -.(4)112 1730 157
  8.5    .(3)918 7769 007    -.(5)734 6824 363
  9.0    .(3)690 9857 765    -.(5)493 5612 689
  9.5    .(3)528 1236 437    -.(5)339 0842 014
 10.0    .(3)409 4730 527    -.(5)237 6406 110
 10.5    .(3)321 5757 191    -.(5)169 6421 007
 11.0    .(3)255 4794 154    -.(5)122 9175 574
 11.5    .(3)205 1024 695    -.(6)904 2245 370
 12.0    .(3)166 2345 586    -.(6)674 0640 244
 12.5    .(3)135 9108 167    -.(6)508 6263 021
 13.0    .(3)112 0109 002    -.(6)388 0974 686
 13.5    .(4)929 9691 087    -.(6)299 1919 424
 14.0    .(4)777 3890 832    -.(6)232 8584 812
 14.5    .(4)653 9677 383    -.(6)182 8395 204
 15.0    .(4)513 3898 749    -.(6)144 7498 667
 15.5    .(4)470 8598 806    -.(6)115 4775 918
 16.0    .(4)402 7015 521    -.(7)927 8837 607
 16.5    .(4)346 0719 079    -.(7)750 6043 467
 17.0    .(4)298 7541 988    -.(7)611 0454 032
 17.5    .(4)259 0066 153    -.(7)500 4028 978
 18.0    .(4)225 4506 124    -.(7)412 1003 883
 18.5    .(4)196 9877 124    -.(7)341 1837 940
 19.0    .(4)172 7369 850    -.(7)283 8913 786
 19.5    .(4)151 9876 810    -.(7)237 3452 480
 20.0    .(4)134 1630 697    -.(7)199 3279 892
 20.5    .(4)118 7926 279    -.(7)168 1195 507
 21.0    .(4)105 4905 051    -.(7)142 3771 352
 21.5    .(5)939 3873 877    -.(7)121 0460 765
 22.0    .(5)838 7409 116    -.(7)103 2932 157
 22.5    .(5)750 7766 502    -.(8)884 5674 819
 23.0    .(5)673 6666 593    -.(8)760 0821 534
 23.5    .(5)605 8783 224    -.(8)655 2351 718
 24.0    .(5)546 1216 848    -.(8)566 6066 961
 24.5    .(5)493 3070 093    -.(8)491 4263 585
 25.0    .(5)446 5105 454    -.(8)427 4401 392







TABLE VI — (Continued)



£ 


  3.0    2.988 9351 852      -.948 1481 481
  3.5    .875 4725 025       -.189 2894 604
  4.0    .370 6973 366       -.(1)590 7882 241
  4.5    .186 3394 083       -.(1)229 8587 984
  5.0    .104 0300 478       -.(1)102 7515 921
  5.5    .(1)624 7216 370    6826 853
  6.0    .(1)396 1994 936    -.(2)269 0942 361
  6.5    .(1)262 2649 076    -.(2)151 5410 433
  7.0    .(1)179 7450 169    -.(3)895 1744 340
  7.5    .(1)126 8009 428    -.(3)550 2997 477
  8.0    .(2)916 6922 412    -.(3)349 9462 528
  8.5    .(2)676 8238 027    -.(3)229 1312 716
  9.0    .(2)508 9783 157    -.(3)153 8970 152
  9.5    .(2)388 9905 578    -.(3)105 7119 802
 10.0    .(2)301 5839 811    -.(4)740 7668 221
 10.5    .(2)236 8373 742    -.(4)528 4390 327
 11.0    .(2)188 1526 456    -.(4)383 0865 153
 11.5    .(2)151 0481 065    -.(4)281 7940 194
 12.0    .(2)122 4214 851    -.(4)210 0557 062
 12.5    .(2)100 0884 111    -.(4)158 4944 724
 13.0    .(3)824 6707 418    -.(4)120 9320 228
 13.5    .(3)684 8389 962    -.(5)932 2623 816
 14.0    .(3)572 4725 971    -.(5)725 5546 509
 14.5    .(3)481 5811 670    -.(5)569 6913 075
 15.0    .(3)407 5132 846    -.(5)451 0040 878
 15.5    .(3)346 7368 814    -.(5)359 7939 891
 16.0    .(3)296 5444 286    -.(5)289 0976 344
 16.5    .(3)254 8421 133    -.(5)233 8608 712
 17.0    .(3)219 9973 645    -.(5)190 3777 170
 17.5    .(3)190 7274 117    -.(5)155 9046 664
 18.0    .(3)166 0170 278    -.(5)128 3924 319
 18.5    .(3)145 0572 508    -.(5)106 2973 088
 19.0    .(3)127 1993 361    -.(6)884 4715 492
 19.5    .(3)111 9198 707    -.(6)739 4524 672
 20.0    .(4)987 9413 925    -.(6)621 0067 139
 20.5    .(4)874 7564 250    -.(6)523 7749 416
 21.0    .(4)776 8023 679    -.(6)443 5733 082
 21.5    .(4)691 6366 980    -.(6)377 1157 568
 22.0    .(4)617 6240 310    -.(6)321 8064 139
 22.5    .(4)552 8493 126    -.(6)275 5833 129
 23.0    .(4)496 0674 883    -.(6)236 7999 628
 23.5    .(4)446 1499 515    -.(6)204 1350 268
 24.0    .(4)402 1467 932    -.(6)176 5230 172
 24.5    .(4)365 3958 471    -.(6)154 0028 718
 25.0    .(4)328 7959 685    -.(6)133 1661 235






TABLE VI — (Continued)





0 

,(1)703 2407 407 

311 ^>?12 %3 

3 5 

,(2)9% 0214 120 

,(1)331 hM 940 

4,0 

1 (2)233 6419 753 

.(1)100 2196 106 

4 5 

,(3)711 2440 363 

, (2) 10 i 4S2U 991 

5.0 

,(3)256 2(i3o 166 

(2)109 02ai ns 

5.5 

(3)104 2151 284 

, (.'^1442 3904 068 

6.0 

. (4)464 8740 588 

.(J)i97 J(W yil 

6,5 

. (4)223 1993 819 

(4J916 lb21 940 

7 0 

.(4)113 8216 820 

.(4)482 2607 431 

7.5 

,(5)610 4831 370 

(4)258 5707 479 

8.0 

.(5)341 8161 061 

.(4)144 7403 415 

8 S 

.(5)198 6271 380 

,(5)840 9223 280 

9.0 

.(5)119 2274 520 

.(5)504 7006 725 

9 5 

.(6)7.36 45S7 70S 

,(5)311 71.58 600 

10.0 

. (6)466 6351 290 

.(5)197 4943 272 

10, s 

.(6)302 4952 3.38 

.(S)I2S 0172 024 

11,0 

.(6)200 1684 248 

.(6)817 0785 482 

11.5 

.(6)134 9506 134 

.(0)571 0651 

12 0 

.(7)925 4160 570 

(6)391 5919 775 

12.5 

. (7)644 .5499 552 

.(6)272 7.556 

13 0 

(7)455 3925 745 

(61192 6912 065 

13.5 

. (7)326 0189 201 

,(6H,V 9465 9.32 

14,0 

,(7)236 2653 096 

(7)999 6815 Ifi.i 

14.5 

.(7)173 1717 709 

.(717.32 7121 (106 

15.0 

.(7)128 2726 524 

.(7)542 7324 021 

15,5 

.(8)959 5450 485 

(7)403 9879 5.30 

16,0 

(8)724 4284 089 

. (7)306 3l)6(> 664 

16.5 

, (8)551 6645 046 

.(7)23,3 4084 8,30 

17.0 1 

.(8)42.3 5236 197 

(7)179 1912 40.3 

17 5 1 

. (8)327 6400 420 

,(7)1,3,H 6225 823 

18.0 

. (8)255 2963 323 

(7)108 01,39 227 

18.5 

. (8)200 2846 930 

(8).SI7 .3860 468 

19 0 

.(8)158 1422 442 

(8)6f)'' 08.10 oOJ 

19 5 

(8)125 6316 099 

, [8 1.3, 31 5324 %2 

20.0 

(8)100 3843 545 

(8)424 7132 721 

20.5 

.(9)806 5.374 2IS 

.(8). 341 2348 327 

21.0 

(9)651 4166 128 

.(8i275 (V148 185 

21 5 

,(9)528 7036 968 

.(8)223 7117 748 

22 0 

(9)431 2S.W 387 

.(SiOHi 4,"66 123 

22.5 

.(9)353 3296 685 

.(81149 4.978 961 

23.0 

.(9)290 7474 080 

, (8) 123 0102 298 

23,5 

.(9)240 2476 147 

.(8)101 644.5 207 

24 0 

.(9)199 3121 .346 

.(9)843 2535 6^S 

24.5 

.(9)166 9628 406 

(9)706 :m> 111 

25.0 

.(9)138 7382 906 

,(9)586 9755. 711 





POLYNOMIAL APPROXIMATION


TABLE VI (Continued)


/O 

/ 

u/ 

3,0 

-,(1)234 9537 037 

.(2)178 2407 407 

3,5 

-.(2)232 9282 407 

.(3)127 3148 148 

4,0 

-.(3)408 9506 173 

. (4) 169 7530 864 

4.5

(4)972 9456 019 

.(5)318 2870 370 

5,0 

-.(4)282 5435 730 

(6)748 9106 754 

5.5

-.(5)947 9582 728 

(6)208 0307 432 

6,0 

-.(5)355 3443 795 

,(7)656 9391 889 

6.5 

-.(5)145 5344 260 

.(7)229 9287 161 

7.0 

- (6)641 0133 904 

.(8)875 9189 186 

7 5 

-,(6)300 1017 659 

(8)358 3304 667 

8.0 

-.(6)148 0060 623 

(8)155 7958 SSI 

8 5 

- (7)763 5619 773 

.(9)714 0643 358 

9.0 

-.(7)409 7430 989 

,(9)342 7508 812 

9.5 

-.(7)227 6566 932 | 

.(9)171 3754 406 

10 0 

-.(7)130 4646 031 

.(10)888 6133 957 

10.5 

-.(8)768 7010 766 

(10)476 0428 90S 

11 0 

-.(8)464 4029 703 

.(10)262 6443 534 

11.5

-.(8)287 0085 966 

.(10)148 8318 003 

12 0 

-.(8)181 0859 646 

.(11)864 1846 467 

12.5 

-.(8)116 4409 022 

.(11)513 1096 340 

13 0 

-.(9)761 8900 625 

.(11)310 9755 357 

13.5 

-.(9)506 5928 672 

.(11)192 0731 250 

14 0 

-.(9)341 8901 625 

.(11)120 7316 786 

14.5 

(9)233 9443 041 

.(12)771 3412 798 

15.0 

-.(9)162 1522 356 

(12)500 3294 788 

15.5 

-.(9)113 7486 502 

,(12)329 1641 308 

16.0 

-(10)806 9508 539 

.(12)219 4427 539 

16 5 

-(10)578 5246 623 

(12)148 1238 589 

17 0 

-(10)418 8850 767 

(12)101 1577 573 

17.5

-(10)306 1363 264 

.(13)698 4702 287 

18.0 

-(10)225 7107 282 

,(13)487 3048 107 

18.5 

-(10)167 8017 503 

(13)343 3283 894 

19,0 

-{10)125 7344 857 

■(13)244 1446 325 

19.5 

-(11)949 1786 023 

.(13)175 1472 363 

20 0 

-(11)721 6269 402 

.(13)126 7022 561 

20 5 

-(11)552 3276 496 

.(14)923 8706 171 

21.0 

-(11)425 4604 167 

.(14)678 7620 861 

21.5 

-(11)329 7379 935 

(14)502 2839 437 

22,0 

-(11)257 0422 413 

,(14)374 2507 816 

22.5 

-(11)201 4893 910 

.(14)280 6880 862 

23.0 

-(11)158 7837 578 

.(14)211 8400 650 

23.5 

-(11)125 7671 107 

.(14)160 8415 309 

24 0 

-(11)100 1019 200 

.(14)122 8244 418 

24 5 

-(12)805 1862 542 

(15)948 6730 535 

25 0 

-(12)642 9736 393 

,(15)728 0195 608 





TABLE VII


(The numbers in parentheses denote the number of ciphers between the decimal point and the first significant figure.)
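The notation described in this note can be read mechanically. The following is a minimal sketch of such a reader; the function name `parse_cipher_notation` and the regular expression are illustrative assumptions, not part of the paper, and the sketch expects entries whose digits have already been transcribed cleanly.

```python
import re


def parse_cipher_notation(entry: str) -> float:
    """Convert a tabulated entry such as '-.(6)377 1157 568' to a float.

    '(k)' means that k ciphers (zeros) stand between the decimal point and
    the first significant figure; an entry without '(k)' has none.
    """
    text = entry.replace(" ", "")
    match = re.fullmatch(r"(-?)(\d*)\.(?:\((\d+)\))?(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised table entry: {entry!r}")
    sign, integer_part, zeros, digits = match.groups()
    # Re-insert the suppressed zeros in front of the printed digits.
    fraction = "0" * int(zeros or 0) + digits
    value = float(f"{integer_part or '0'}.{fraction}")
    return -value if sign else value


print(parse_cipher_notation(".(4)691 6366 980"))
print(parse_cipher_notation("-.(6)377 1157 568"))
print(parse_cipher_notation("28.751 2015 275"))
```

Entries in which the scan has damaged a digit or the parenthesised count are rejected with a `ValueError` rather than guessed at.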




S' 

3 5 

28.751 2015 275 

-2.018 4913 853 

4.0 

1,362 9280 045 

-.639 9553 S71 

4.5 

.826 9536 776 

- 286 6103 525 

5.0

,557 2123 756 

ISO 4344 206 

5.5 

.399 2748 295 

-.(1)869 6309 477 

6 0 

298 3433 421 

-.(1)537 5591 718 

6.5 

229 9463 348 

-.(1)349 3835 704 

7.0 

.181 5696 107 

- (1)236 2100 221 

7 5 

. 146 2040 440 

: - (1)164 8961 618 

8.0 

119 6560 886 

- (1)118 2279 748 

8.5 

(1)992 8681 536 

-.(2)867 1239 S12 

9 0 

.(1)833 6752 283 

- (2)648 5413 101 

9.5

' .(1)707 2793 855 

-.(2)493 4138 461 

10.0 

.(1)605 5349 149 

-.(2)381 0899 486 

10 5 

(1)522 6404 211 

(2)298 3079 274 

11 0 

. (10454 3790 237 

-.(2)236 3306 957 

11.5 

.(1)397 6190 688 i 

-.(2)189 2706 474 

12 0 

.(1)350 0205 660 

-.(2)1S3 0798 651 

12.5 

(1)309 7896 316 

-.(2)124 9247 269 

13.0 

.(l.)275 5429 712 

-.(2)102 7891 S76 

13,5 

,(1)246 1999 134 

-.(3)852 1744 589 

14 0 

, (1)220 9073 683 

-,(3)711 4425 518 

14.5 

(1)198 9854 583 

-.(3)597 8026 293 

15 0 

.(1)179 8875 910 

' -(3)505 3397 258 

15.5 

.(1)163 1707 725 

-.(3)429 5745 974 

16 0 

,(1)148 4732 835 

-.(3)367 0820 942 

16.5 

(1)135 4977 1S6 

- (3)315 2193 163 

17 0 

.(1)123 9979 553 

-.(3)271 9296 956 

17.5 1 

.(1)113 7691 065 

(3)235 6002 537 

18 0 

.(1)104 6396 241 

-.(3)204 9565 434 

18.5 

.(2)964 6512 317 

-,(3)178 9845 869 

19 0 

(2)891 2347 318 

-.(3)156 8723 569 

19.5 ' 

.(2)842 5297 627 

-. (3) 137 9655 374 

20 0 

.(2)765 3875 595 

-.(3)121 7338 175 

20 5 

.(2)711 3114 773 

-,(3)107 7450 175 

21.0 

.(2)662 2270 210 

-. (4)956 4508 777 

21 5 

(2)617 5695 829 

- (4)851 4293 652 

22.0 

.(2)576 8498 581 

(4)759 9627 856 

22 5 

.(2)539 6419 780 

- (4)680 0S9S 916 

23.0 

.(2)505 5744 586 

- (4)610 0441 976 

23 5 

, (2)474 3260 497 

-.(4)548 5160 065 

24 0 

.(2)445 5992 272 

-.(4)494 2969 537 

24 5 

,(2)419 1587 794 

- (4)446 3969 384 

25.0 

.(2)394 7661 411 

-. (4)403 9598 359 






TABLE VII (Continued)



C' 

D/ 

3.5

357 6197 452 

- (1)173 0659 191 

4,0 

(1)798 5119 048 

- (2)282 3837 868 

4.5 

.(1)269 7269 968 

-.(3)732 8051 663 

5.0 

,(1)111 5137 722 

-.(3)241 1381 219 

5.5 

.(2 ) 523 4528 750 

-,(4)925 0558 091 

6.0 

.(2)268 7939 211 

-,(4)396 2769 318 

6.5 

.(2)147 7366 521 

-,(4)184 7282 763 

7.0 

(3)856 9155 732 

- (5)921 2018 141 

7.5 

.(3)519 3903 212 

-.(51485 5539 780 

8 0 

.(3)326 6117 660 

-,(5)268 1183 076 

8.5 

.(3)211 9254 433 

- f5)154 0554 574 

9.0 

.(3)141 2859 194 

-.(6)916 1750 131 

9 5 

.(4)964 5009 167 

-.(6)561 5350 601 

10.0 

.(4)672 3458 517 

-.(6)353 4749 049 

10.5 

. (4)477 5025 292 

-.(6)227 8602 485 

11.0 

.(4)344 8411 857 

- (6)150 0558 578 

11.5 

(4)252 8236 420 

-.(6)100 7434 807 

12.0 

(4)187 9176 343 

-. (7)688 3248 748 

12.5 

(4)141 4318 835 

-.(7)477 8804 517 

13 0 

.(4)107 6721 454 

-. (7)336 6794 370 

13.5 

,(5)828 3962 809 

-,(7)240 4245 028 

14 0 

.(5)643 5779 618 

-.(7)173 8435 375 

14.5 

.(5)504 5238 369 

(?)127 1625 885 

15 0 

,(5)398 8437 502 

-,(8)940 2155 070 

15 5 

,(5)317 7726 592 

-.(8)702 1757 322 

16.0 

.(5)255 0351 049 

-.(8)529 3336 632 

16.5 

.(5)206 0875 950 

- (8)402 5511 467 

17.0 

,(5)107 6061 262 

- (8)308 6648 3S3 

17 5 

(5)137 1349 784 

-.(8)238 5149 470 

18.0 

.(5)112 8433 182 

-.(8)185 6576 541 

18,5 

,(6)9.33 5432 357 

-.(8)145 5130 248 

19.0 

.(6)776 2421 574 

- (8)114 7942 689 

19 5 

(6)648 5562 423 

-,(9)911 2100 394 

20,0 

.(6)544 3500 271 

- (9)727 5436 127 

20 5 

.(6)458 8701 286 

-.(9)584 1368 200 

21.0 

.(6)388 4099 727 

- (9)471 4844 101 

21,5 

.(6)330 0128 092 

-.(9)382 4793 533 

22.0 

. (6)281 5293 946 

-.(9)311 7704 559 

22.5 

.(6)240 9925 109 

-.(9)255 3016 408 

23.0 

.(6)206 9976 250 

-.(9)209 9789 105 

23.5 

(6)178 3794 880 

- (9)173 4277 959 

24 0 

.(6)154 1991 740 

-.(9)143 8154 281 

24 5 

.(6)133 6979 531 

- (9)119 7203 889 

25,0 

,(6)116 2537 745 

-.(9)100 0289 166 





TABLE VII (Continued)


/:> 

/f' 

r' 

3.5 

1.579 8913 122 

- 291 1769 387 

4.0 

.344 9276 620 

- (1)455 1504 630 

4 5 

.115 4442 217 

-.(1)115 8781 297 

5.0 

.(1)475 1410 972 

-.(2)377 5757 988 

5.5 

.(1)222 4862 588 

-,(2)144 0406 919 

6.0 

.(1)114 0808 001 

-,(3)614 9607 748 

6.5 

.(2)626 4468 334 

-,(3)286 0508 258 

7 0 

1 ,(2)363 1390 089 

-.(3)142 4423 327 

7.5 

(2)220 0141 501 

-.(4)750 0507 440 

8.0 

,(2)138 3131 064 

-.(4)413 8794 788 

8.5 

.(3)897 2722 669 

- (4)237 6853 941 

9.0 

.(3)598 0994 308 

^ (4)141 2990 508 

9.5 1 

.(3)408 2508 260 

-.(5)865 7917 011 

10 0 

.( 3)284 5633 018 

-.(5)544 8786 925 

10 5 

.(3)202 0841 107 ' 

-.(5)351 1847 710 

11.0 

(3)145 9325 974 

-.(5)231 2393 114 

11.5 

, (3) 106 9873 276 

- (5) 155 2312 455 

12.0 

.(4)795 1834 088 

-.(5)106 0518 396 

12.5 

.(4)598 4598 613 

-.(6)736 2301 182 

13.0 

.(4)455 5973 535 

-.(6)518 6638 399 

13.5 

. (4)350 5159 780 

(6)370 3629 282 

14,0 

.(4)272 3103 032 ^ 

-.(6)267 7874 363 

14.5 

.(4)213 4710 252 

-,(6)195 8739 375 

15.0

(4)168 7544 633 

-.(6)144 8213 551 

15.5

.(4)134 4512 878 

-,(6)108 1535 449 

16.0 

.(4)107 9058 431 

(7)815 2967 606 

16.5 

.(5)871 9546 641 

-.(7)620 Oils 559 

17.0 

,(5)709 1357 523 

-,(7)475 4002 735 

17.5 

.(5)580 2104 892 

-.(7)367 3519 091 

18.0 

.(5)477 4317 347 

-.(7)285 9398 751 

18.5 

.(5)394 9737 353 

- (7)224 1091 1.35 

19.0 

,(5)328 4199 721 

-.(7)176 7967 053 

19.5

,(5)274 3965 893 

-.(7)140 3360 406 

20,0 

(5)230 3075 611 

-.(7)112 0487 209 

20.5 

,(5)194 1416 638 

-.(8)899 6217 202 

21,0 

(5)164 3306 063 

- (8)726 1234 346 

21 5 

.(5)139 6233 668 

-,(8)589 0458 851 

22.0 

.(5)119 1105 861 

-.(8)480 1471 084 

22 5 

(5)101 9599 .133 

- (8)393 1799 685 

23.0 

,(6)875 7714 928 

-.(8)323 3791 671 

23,5 

,(6)754 6921 922 

-,(8)267 0876 406 

24,0 

.(6)652 3888 702 

-,(8)221 4825 166 

24.5 

.(6)565 6513 791 

- (8)184 3745 724 

25 0 

.(6)491 8478 631 

-.(8)154 0485 226 




TABLE VII (Continued)




H' 

3,5 

,(1)143 3986 442 

.(1)545 8043 981 

4 0 

.(2)165 3852 513 

.(2)616 8981 481 

4.5

.(3)325 3719 022 

.(2)120 1878 234 

5.0

. (4)847 0633 024 

.(3)311 2518 155 

5.5

,(4)264 7923 5l0 

,(4)969 9931 081 

6,0 

,(5)944 9259 723 

.(4) .345 4902 916 

6.5 

.(5)373 3302 227 

.(4)136 3314 118 

7,0 

.(5)160 0110 167 

.(5)583 8397 737 

7,5 

.(6)733 3626 200 

.(5)267 4300 767 

8.0 

.(6)355 6040 392 

.(5) 129 6221 514 

8.5 

.(6)180 9471 484 

,(6)659 3767 449 

9 0 

. (7)960 0363 156 

.(61349 7016 94'' 

9 5 

,(7)528 3674 230 

. (61 192 4628 42 

10,0 

.(7)300 3768 815 

. (6) 109 4008 386 

10,5 

.(7)175 7761 725 

.(7)640 1325 618 

11.0 

(7)105 5698 126 

. (7)384 4277 647 

11.5 

,(8)649 1137 736 

,(7)236 3567 152 

12,0 

.(8)407 7298 452 

.(7)148 4557 926 

12.5 

.(8)261 1497 436 

.(8)950 8148 611 

13.0 

. (8) 170 2826 250 

,(8)619 9571 695 

13.5 

.(8)112 8752 998 

,(8)410 9394 710 

14.0 

,(9)759 6818 749 

,(8)276 5670 073 

14.5 

.(9)518 5449 520 

(8)188 7758 141 

15.0 

.(9)358 6183 357 

. (8) 130 5524 791 

15.5

.(9)251 0639 560 

.(9)913 9675 989 

16 0 

.(9)177 7849 669 

.(9)647 1965 291 

16.5 

. (9) 127 2480 223 

.(9)463 2203 741 

17 0 

.(10)919 9584 852 

.(9)334 8890 426 

17 5 

.(10)671 4105 266 

.(9)244 4091 600 

18 0 

.(10)404 3960 450 

,(9)179 9705 267 

18.5 

.(10)367 1241 847 

.(9)133 6401 141 

19,0 

.(10)274 7920 816 

. (9) 100 0289 750 

19.5 

.(10)207 2366 516 

,(10)754 3731 214 

20 0 

.(10)157 4099 489 

.(10)572 9940 142 

20.5 

.(10)120 3776 663 

.(10)438 1898 319 

21.0 

,(11)926 5399 250 

.(10)337 2711 214 

21.5 

.(11)717 5522 633 

, (10)261 1964 982 

22 0 

,(11)558 9721 931 

,(10)203 4711 966 

22 5 

.(11)437 8836 453 

.(10)159 3934 600' 

23 0 

.(11)344 8668 374 

,(10)125 5342 827 

23.5 

(11)273 0031 904 

.(11)993 7516 417 

24.0 

.(11)217 1767 993 

.(11)790 5382 130 

24.5 

.(11)173 5819 878 

.(11)631 8492 407 

25.0 

.(11)139 3623 589 

,(11)507 2869 968 


TABLE VII (Continued)


 3.5    -.(2)270 9986 772     .(3)135 1095 994
 4.0    -.(3)227 3478 836     .(5)844 4349 962
 4.5    -.(4)343 6965 064     .(6)993 4529 367
 5.0    -.(5)713 2482 623     .(6)165 5754 895
 5.5    -.(5)182 5352 461     .(7)348 5799 778
 6.0    -.(6)544 3210 423     .(8)871 4499 445
 6.5    -.(6)182 6693 153     .(8)248 9856 984
 7.0    -.(7)674 0025 445     .(9)792 2272 223
 7.5    -.(7)268 9333 213     .(9)275 5572 947
 8.0    -.(7)114 6212 362     .(9)103 3339 855
 8.5    -.(8)516 9083 906     .(10)413 3359 421
 9.0    -.(8)244 8220 580     .(10)174 8728 986
 9.5    -.(8)121 0509 065     .(11)777 2128 825
10.0    -.(9)621 7703 060     .(11)360 8488 383
10.5    -.(9)330 4159 767     .(11)174 2028 875
11.0    -.(9)181 0370 007     .(12)871 0144 373
11.5    -.(9)101 9713 676     .(12)449 5558 386
12.0    -.(10)588 9829 883    .(12)238 8265 393
12.5    -.(10)348 0938 563    .(12)130 2690 214
13.0    -.(10)210 1044 796    .(13)727 9739 432
13.5    -.(10)129 2993 719    .(13)415 9851 104
14.0    -.(11)810 1043 368    .(13)242 6579 811
14.5    -.(11)516 0618 886    .(13)144 2831 239
15.0    -.(11)333 8664 755    .(14)873 2925 919
15.5    -.(11)219 1292 642    .(14)537 4108 258
16.0    -.(11)145 7726 865    .(14)335 8817 661
16.5    -.(12)982 0445 539    .(14)212 9981 931
17.0    -.(12)669 4697 055    .(14)136 9274 099
17.5    -.(12)461 4992 604    .(15)891 6203 434
18.0    -.(12)321 4946 024    .(15)587 6588 627
18.5    -.(12)226 1959 235    .(15)391 7725 751
19.0    -.(12)160 6464 099    .(15)264 0206 484
19.5    -.(12)115 1112 743    .(15)179 7587 394
20.0    -.(13)831 8162 819    .(15)123 5841 333
20.5    -.(13)605 9221 411    .(16)857 5225 577
21.0    -.(13)444 7507 764    .(16)600 2657 904
21.5    -.(13)328 8288 593    .(16)423 7170 285
22.0    -.(13)244 8106 616    .(16)301 4909 626
22.5    -.(13)183 4686 278    .(16)216 1633 317
23.0    -.(13)138 3685 504    .(16)156 1179 618
23.5    -.(13)104 9876 917    .(16)113 5403 358
24.0    -.(14)801 2235 814    .(17)831 2774 587
24.5    -.(14)614 8704 752    .(17)612 5263 839
25.0    -.(14)474 3701 124    .(17)454 1098 277



THE PRECISION OF THE WEIGHTED AVERAGE 


By 

H. Milicer Gruzewska, Ph. Dr.,
Warsaw, Poland. 


Introduction. We shall consider an infinite universe of elements characterized by pairs of variable quantities $x_i, y_i$ $(i = 1, 2, 3, \ldots)$. Regarding the values of $y_i$ as the weights to be assigned to the variates $x_i$, the weighted average of $x_i$ may be denoted by $x_y$, i.e.

$$x_y = \frac{\sum x_i y_i}{\sum y_i}.$$

All possible samples, each of $N$ pairs of variates $(x_i, y_i)$, that can be selected from the universe constitute the sample population.

Our problem is to obtain an expression for the probable precision of the weighted average $X_y$ according to certain hypotheses concerning the selection of the pairs of variates in various samples. Professor Bowley discussed this problem in his paper on 'Precision of Measurement Attained in Sampling'¹ presented in Rome during the Congress of Statistics 1925. In this paper Professor Bowley made no allowance for correlation between the variates $x$ and $y$. In the present paper I shall attempt to eliminate this restriction. I am greatly indebted to Professor A. L. Bowley for suggestions regarding the simplification of the proof of theorem II and for his general assistance in improving the form of this paper.

Let us suppose:

(a) the pairs of elements selected from the universe are independent of each other,


¹Cambridge 1925.





(b) the number of pairs in each sample is so large that ^ may 
be neglected, 

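(c) the frequency surface is normal, i.e. the probability $p_i$ that the particular pair $(x_i, y_i)$ will be selected is,

$$p_i = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\exp\left\{-\frac{1}{2(1-r^2)}\left[\frac{(x_i-x)^2}{\sigma_x^2}-\frac{2r(x_i-x)(y_i-y)}{\sigma_x\sigma_y}+\frac{(y_i-y)^2}{\sigma_y^2}\right]\right\}$$

where $x$, $y$, $\sigma_x$, $\sigma_y$ and $r$ designate the parameters characterizing the surface,

(d) the a priori chance that the parameters of (c) are equal to given values may be defined by the function $F(x_y, y, \sigma_x, \sigma_y, r)$, where this function is integrable, can be expanded in Taylor series and converges over the whole space.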

Let the calculated characteristics of the sample be,

$X_y$ the weighted average of $x_i$ with $y_i$ as weights,

$Y$ the arithmetic average of the variates $y_i$,

$X$ the arithmetic average of the variates $x_i$,

$S_x$ the standard deviation of the variates $x_i$,

$S_y$ the standard deviation of the variates $y_i$,

$R$ the coefficient of correlation between the variates $x_i$ and $y_i$,

where $i = 1, 2, \ldots, N$.

The expressions representing the most probable values of the 
weighted average and its standard deviation are independent 
whether the parameters of the universe are known or unknown. 
In Parts I, II, and III we shall consider the respective cases, 

(a) when all parameters are unknown, 

(b) all but y are unknown, 

(c) all but $y$ and $\sigma_y$ are unknown.

In Part IV we shall consider the generalized case of Part I when there are $k$ sets of elements, i.e. $x_{ij}, y_{ij}$ $(i = 1, 2, \ldots, k)$, in the universe. In order to consider this case we shall, at the beginning of Part IV, slightly change the hypotheses and modify the above notation.


PART I 

Case Where All Parameters Are Unknown 

Theorem (1.1). If hypotheses (a) and (c) are satisfied and if $S_x S_y (1 - R^2) \neq 0$ then the most probable value of $x_y$ is $X_y$.

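Proof. If $P_N$ denotes the probability of getting $N$ particular pairs of variates, then it follows from hypotheses (a) and (c) that,

$$P_N = \left(\frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\right)^{N}\exp\left\{-\frac{1}{2(1-r^2)}\sum_{i=1}^{N}\left[\frac{(x_i-x)^2}{\sigma_x^2}-\frac{2r(x_i-x)(y_i-y)}{\sigma_x\sigma_y}+\frac{(y_i-y)^2}{\sigma_y^2}\right]\right\} \tag{1}$$

Taking the partial derivatives of $P_N$ with respect to $x$, $y$, $\sigma_x$, $\sigma_y$ and $r$, setting them equal to zero, and solving for $x$, $y$, $\sigma_x$, $\sigma_y$ and $r$, yields

$$x = X, \quad y = Y, \quad \sigma_x = S_x, \quad \sigma_y = S_y, \quad r = R, \tag{2}$$

hence $x = X$, $y = Y$, $\sigma_x = S_x$, $\sigma_y = S_y$ and $r = R$ will make $P_N$ a maximum, and the maximum value of $P_N$ is,

$$P_{\max} = \left(\frac{1}{2\pi S_x S_y\sqrt{1-R^2}}\right)^{N} e^{-N}. \tag{3}$$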
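The maximum stated in (2) and (3) can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are simulated (they are not from the paper), the standard deviations use the divisor $N$, and the helper name `log_likelihood` is hypothetical. It compares $\log P_N$ at the sample characteristics with the closed form $-N\log\left(2\pi S_x S_y\sqrt{1-R^2}\right) - N$ and with a few nearby parameter values.

```python
import math
import random


def log_likelihood(xs, ys, x, y, sx, sy, r):
    """log P_N for N independent pairs drawn from the normal surface of (c)."""
    n = len(xs)
    quad = sum(
        ((xi - x) / sx) ** 2
        - 2.0 * r * ((xi - x) / sx) * ((yi - y) / sy)
        + ((yi - y) / sy) ** 2
        for xi, yi in zip(xs, ys)
    )
    norm = 2.0 * math.pi * sx * sy * math.sqrt(1.0 - r * r)
    return -n * math.log(norm) - quad / (2.0 * (1.0 - r * r))


random.seed(1)  # simulated, illustrative data
xs = [random.gauss(0.0, 1.0) for _ in range(200)]
ys = [0.5 * xi + random.gauss(3.0, 1.0) for xi in xs]
n = len(xs)

# Sample characteristics with divisor N (an assumption of this sketch).
X = sum(xs) / n
Y = sum(ys) / n
S_x = math.sqrt(sum((xi - X) ** 2 for xi in xs) / n)
S_y = math.sqrt(sum((yi - Y) ** 2 for yi in ys) / n)
R = sum((xi - X) * (yi - Y) for xi, yi in zip(xs, ys)) / (n * S_x * S_y)

at_sample_values = log_likelihood(xs, ys, X, Y, S_x, S_y, R)
closed_form = -n * math.log(2.0 * math.pi * S_x * S_y * math.sqrt(1.0 - R * R)) - n

# Moving any one parameter away from its sample value should lower log P_N.
perturbed = [
    log_likelihood(xs, ys, X + 0.1, Y, S_x, S_y, R),
    log_likelihood(xs, ys, X, Y - 0.1, S_x, S_y, R),
    log_likelihood(xs, ys, X, Y, 1.1 * S_x, S_y, R),
    log_likelihood(xs, ys, X, Y, S_x, 0.9 * S_y, R),
    log_likelihood(xs, ys, X, Y, S_x, S_y, 0.9 * R),
]
print(at_sample_values, closed_form, max(perturbed))
```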





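The weighted averages $x_y$ and $X_y$ can be expressed in terms of $x, y, \sigma_x, \sigma_y, r$ and $X, Y, S_x, S_y, R$ respectively,

$$X_y = \frac{\sum x_i y_i}{\sum y_i} \quad \text{(by definition)},$$

$$N R S_x S_y = \sum x_i y_i - N X Y \quad \text{(by definition)}, \qquad \sum y_i = N Y \quad \text{(by definition)},$$

hence,

$$X_y = X + \frac{R S_x S_y}{Y}, \tag{4}$$

similarly,

$$x_y = x + \frac{r \sigma_x \sigma_y}{y}.$$

Since, by (2), the most probable values of $x, y, \sigma_x, \sigma_y, r$ are $X, Y, S_x, S_y, R$, the most probable value of $x_y$ is $X_y$. This proves theorem (1.1).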
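Relation (4) is an algebraic identity and can be confirmed on any set of pairs. The following is a minimal sketch, assuming simulated data and standard deviations computed with the divisor $N$; the function names are illustrative and do not come from the paper.

```python
import math
import random


def sample_characteristics(xs, ys):
    """Return X, Y, S_x, S_y, R, using the divisor N (assumed here)."""
    n = len(xs)
    X = sum(xs) / n
    Y = sum(ys) / n
    S_x = math.sqrt(sum((x - X) ** 2 for x in xs) / n)
    S_y = math.sqrt(sum((y - Y) ** 2 for y in ys) / n)
    R = sum((x - X) * (y - Y) for x, y in zip(xs, ys)) / (n * S_x * S_y)
    return X, Y, S_x, S_y, R


def weighted_average(xs, ys):
    """Weighted average of the x's with the y's as weights."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(ys)


random.seed(0)  # simulated, illustrative data
xs = [random.gauss(10.0, 2.0) for _ in range(500)]
ys = [5.0 + 0.3 * (x - 10.0) + random.gauss(0.0, 1.0) for x in xs]

X, Y, S_x, S_y, R = sample_characteristics(xs, ys)
direct = weighted_average(xs, ys)          # definition of X_y
via_identity = X + R * S_x * S_y / Y       # right-hand side of (4)
print(direct, via_identity)
```

The two printed values agree to rounding error, and they differ from the plain average $X$ by the correlation term $R S_x S_y / Y$.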

Theorem (1.2). If all four hypotheses are satisfied and if $S_x S_y (1 - R^2) \neq 0$ then the a posteriori probability $P$ that the sample came from the universe, the weighted average $x_y$ of which satisfies the inequality $|x_y - X_y| \le \varepsilon$, can be expressed by,


(S) 


V. 



where 






Proof. It follows from (4) that,

$$x - X = (x_y - X_y) - \left(\frac{r \sigma_x \sigma_y}{y} - \frac{R S_x S_y}{Y}\right).$$

Substituting the above value of $(x - X)$ in (1) we shall have,


( 6 ) 


1 V A/ 


, . . ■■■^ J e 

^ \ a )irCiy ^ 


w/ 


where 


We y - 2r/r - -- 7^ CJ~ r 


and 


V 1 y y I CXy y / 


Let: 


Xy ~Xy = y Sy ^ Cf^ -S^ ^ ^ 
y- r= X'jy. ay~Sy. X' 


yQj>r- ^ 


then, 


5 




rr7^» 


N 

e 


N 


w, 

in tvhich 




and 




P 

Taking the liii;aiithin oi - — we shall have, 

p ' Pip X ^ , 

/V 7^”" 

A-cor?st ^ foaPJpAl) 

iH) ^ 


^ /oyr/^A%JdyW^yC>/ 


Expanding A in tenr of the small quantities A' A" <^!d!'p 
to second powers .msive and letting obtain 


where 

^ AA I ‘'A 

J-P^AAMV r 

^ fj-pva-^v 

(S') S ' CJ-if<>{l- /i 7 \l-P^ 

y.^-T/--er/^2ir „ erAdVd! ~ V 
* 'r'^7/7’7 


./, _dl 

^ /td^ii-kA{i kv\ 


(We shall make use of the above substitution in the next paragraph.)





Therefore the probability of getting a particular set of $N$ pairs of variates can be expressed approximately by,

(9) ^ a const times e 

Then it follows from hypothesis (d) and (8') that the a posteriori probability that the sample came from the universe, the weighted average of which satisfies the inequality $|x_y - X_y| \le \varepsilon$ whatever the parameters $y$, $\sigma_x$, $\sigma_y$ and $r$ may be, is expressed by,


( 10 ) 


J /■■■■/ ,...r)e 




■*^y 


■ oa -/ 









We may write, 






rj<? 


/- Ndf 






where 


cr since f-^^) - <2', 


'T X ry ^y, rj & 


-A/rAl/e*A^) 





[ Let, pVfx.y-J^yJ 
and 



(12) 


I then, /^= / y,, 

<ep7 ''1'^' 

It follows from (S'), (11), and (12) and hypothesis (d) that 
^fw) can be developed in Taylor^s series for all values of A/, 
hence, 


(13) 


^,7;^ • J]- 


Neglecting terms of order of J we shall have, 


re/77 

but, / ee ' “'cZ 
V/V 


oa 

oa -Z!^ K 

-Ck? 


(odd function) 


and 


hence, /^« 










Let cf^ ,-^. and E’^/Nt 

iz77 


then 


i 






rr^ 


'df 


This proves theorem (1.2).





PART II


Case Where y is Constant*


Theorem (2.1). If hypotheses (a) and (c) are satisfied and

f (1 a) cf = 


(l.b) 


(1 c) 
^ (l.d) 


where and arc the most probable values of 

and respectively; and, 


( 2 ) 




= 


-f. = 


y- ^ 

y-y 




Proof. The probability of getting $N$ particular pairs of variates is given by (6) of Part I. Taking the partial derivatives of $P_N$ with respect to $x_y$, $\sigma_x$, $\sigma_y$ and $r$ and setting them equal to zero, we obtain,


( 3 ) 


(3,a) 

(3.b) 

(3.C) 




*Case where all the parameters but y are unknown.





where $W_{\sigma_x}$, $W_{\sigma_y}$ and $W_r$ mean the partial derivatives of $W$ with respect to $\sigma_x$, $\sigma_y$ and $r$ respectively.
But,

iaj cr^ay 


^y-^y 


Wj, .ZVa - zf^y^ZrP 4 ^ - Z(I-r¥^) 

y ‘^y \ c^/ cr^CTy '■ I cTy/ 

Wr = z vv'-ze -zrf^y 

r a^cf \ cry/ 


since V'=0 obtain, 


( 3 ') 


(3'.) (i-rV-f^f.rP 

(3-,b) n-rV-f^^ rP ^^y-rV/^jU 
(3'.c) 


Uy/ J ‘ ay, 

Solving for $\sigma_x$, $\sigma_y$ and $r$ from (3) and making use of the substitutions from (2) we get the most probable values of $\sigma_x$, $\sigma_y$ and $r$,


m 


^ v 


:= Jy 


(4.a) 

(4 b) 

(4.C) 

and from (3.d) we obtain, 

(4.d) =Xy - -f cf, f 


Vj^nfj /rj, PM-, 




(4M)) J’y = 

(4'.c) jrj- kf)/fl- /J7r V 

hence, 

Substituting the above value of P J'y in (4,cl) we obtain, 

(4',d) x.y ^ =JCy O', 

This proves theorem (2.1) 

If we denote the maximum probability by $P_{\max}$ then,



Theorem (2.2). If all four hypotheses² are satisfied and if $S_x S_y (1 - R^2) \neq 0$ then the a posteriori probability $P$ that the sample came from the universe, the weighted average $x_y$ of which satisfies the inequality $|x_y - X_y| \le \varepsilon$, can be expressed by,

( 6 ) 


Proof. Let then by substituting the 

values of and p from (4^.a), (4'.b) and (4'x) we get, 

²In this case the function $F(x_y, y, \sigma_x, \sigma_y, r)$ in (d) is







hence, 


( 8 ) 


<5j j l-r^ 




, ^ 

r^) ‘ 

P 


N 






( 9 ) 


' -V * /D 

raking the logarithm of -p^^'oA letting 
we shall have, 

1 B Pmax- _ A where 

/V ^ 

. Cond 1 /-/^ • 

Expanding in terms of the small quantities <^‘ a”d/^ 

we obtain, 

> ^(C^-yJ-V 1. ^ ^ „ 

. f<Cki-2k)^ ‘dr-^r/k^^ 

■Z(k^k/ )d;o ^ 

~Zr'Xl-l<(l<+2f(,)\ Z'/O-Zfe 


[ 10 ) 


. ,p: I A'‘^ 





The expression representing the value of A, is quadratic in 
form in terms of the variables /s where all the co- 

efficients are positive, 

I 2-r^^^r/krk4.2k,) / 

, r. " \l<(Z-r:k,V.k/i-nf]\ d'.ll-kVj 

(i-rfj(li-kk,)^tfj-t-rfjkfi-kf)(/:i {ktkrr/k/J+kk,J\ d' 1 

^ 2\j.r/^^Wk^2kJ2 (H VCMk,)^fI.QVkYj.kf)] 

* 2 [(l-r/)(Ff kk,)^^rJ^r/JkVZ-kfJ\ 


For the rest of the proof of this theorem we proceed as in 
Part I and can obtain. 


( 12 ) 






('J -^k kf 

a-k,j^ 


Notice that if $y = Y$ then,


K,.k,. O , a-, = 5^ -'= 5 -'^ ,'2-^ cend ^Xy 

(13) cXy^cr^, J 2-P^-ck^ 


hence cry <eT if ?? 4 O where cr is given hy (5) Part I, 


PART III 

Case Where $y$ and $\sigma_y$ are Constants³

Theorem (3.1). If hypotheses (a) and (c) are satisfied and if $S_x S_y (1 - R^2) \neq 0$ then,


³Case where all the parameters but $y$, $\sigma_y$ are unknown.





/ n Z 


a, ^ ^ Sy ^ 


sy -y 


(ii>) 




( 1 .c) 

X 5y I <fy y y ct 


-*y J. ■ S^{-^yJ'^^~^ 

where a,, Q and are the most probable values of and 
niy respectively and 

Y <Xy <^y y 

Theorem (3.2). If all four hypotheses* are satisfied and if $S_x S_y (1 - R^2) \neq 0$ then the a posteriori probability that the sample came from the universe, the weighted average $x_y$ of which satisfies the inequality $|x_y - X_y| < \varepsilon$, can be expressed by,

^ (f 

^ . j e di where 




Notice that if $y = Y$ and $\sigma_y = S_y$ then,


^ ^ I Xy and 

cr^ = 

hence cXy < if 
ivhere ay* is given by (12) Part IL 

As the proofs of theorems (3.1) and (3.2) do not differ from the proofs of theorems (2.1) and (2.2), we shall omit them.*

this case the function /?iy, /, (d) is ^ ) 

*Parts I and II were presented in Wilno during the II Assembly of Polish Mathematicians.







PART IV 


In this Part we shall consider the generalized case of Part I 
where there are k sets of elements characterized by pairs of vari- 

able quantities, xj, y- 

k ^ 

^ Xy Af y An 

Let, -EXy~ 

ZAf ^ A 

where Xy is the weighted average of the variates , with 

as weights, and the sum of these weights. Our problem is to 
obtain an eJcpression for the probable precision of the quantity x 
according to certain hypotheses. 

We shall replace hypothesis (b) of the introduction by hypo- 
theses (b' ) and (b" ) where, 

(b ' ) the’ number {N = of pairs in 

each sample is so large that ^ may be neglected, 

(b " ) each of the numbers N^( 1^2,3. •••-.A) of pairs 

from separate sets is so large thati^has a significant value, i.e., 

' / 

Let us replace in hypothesis (c) by {^=-^A3y—A) 

and X, y, cJ^,c!y, r by ’"e^er to the 

corresponding general hypothesis by (c ' ). Likewise if in hypo- 
thesis (d) we replace Pfxy,y cl^,cfy, r by Ffxy,yfcrj,a^, r^) 
we obtain the generalized hypothesis (cf^ ). 


1 . 2 , 3 . , /r I 

1.^.3. .ooj 





We shall denote the calculated characteristics of the sample by $X_{y_i}$, $Y_i$, $X_i$, $S_{x_i}$, $S_{y_i}$, $R_i$, corresponding to the values $X_y$, $Y$, $X$, $S_x$, $S_y$, $R$ defined in the introduction, page 197.

Theorem (4.1). If hypotheses (a) and (c' ) are satisfied 

and if (l-TS) 3x3y 40 then the most probable value of a: is Jf 
where, 

X= LXy ^ 

/ ^ 

Proof*. Let ^ be the probability of getting a given set of 
N pairs of variates f then it follows from hypotheses 

(a) and (c ^ ) that, 



*The proofs of theorems (4.1) and (4.2) will be given in very abbreviated form, as the method of proof of these theorems does not differ from that of theorems (2.1) and (2.2) of Part II.





x-X = D , Xy -Xy = ; 








cs^ Jy 

'J -^y‘ 




and we mu also express the unknown quantity d^~(A=l,Z,' ,k) 

in, terms oiZ> and the independent variable • ■,l<-i) 

as follows, 


\>r ' ’^^y' '"' ‘ 




Hence it follows from (I) that, 





k 

P^rr . 
M 




g zii-rs^a->-p^)^2 


’r^nsjs^ri-A'^Xi^Ap 


where 


( 5 ) 




•2P. 


2 


a*Ap(j*Ap 




and 




where d^ are to be found from the equations (3) and(2f««5^"‘A'^ 
Taking the partial derivitives of 7^ with respect to D and 



we obtain, 

' 



dPn^ i ^ 1 , 

dPn 



dD ‘ k, 

dd^ 


(6) 

dPn 1 dPn 

1 

dPn 


^'Oi ^k '^^k 


dd^ 




It can be easily verified that if




da 


»c7 


The probability treated as the function of variables 

"K-K ^ maximum when, 

This proves theorem (4.1).





Theorem (4.2). If all hypotheses are satisfied and if 
then the posteriori probability that the sample came from the 
universe, the quantity x of which satisfies the inequality £ 
may be expressed by^ 

6 

f ^ where 






Proof. Let $P_{\max}$ denote the maximum probability; then it follows from (6) that,


« -A^ A' i 

P' . 

fr)cix i ^ 




3l 

N k 

p ^ 

77 

max 

/ 



= e 

Tf 


/ 


Ji-P} 




Fii 








where the value of given by (5) and w/jp 

P 

As in Part I or Part 11 if we expand the terms of 

rv f H f k ^ 

^>’^ 1 ' *-^*'*” vanish 

is quadratic in form in terms of the variables, 

‘ ^Z>'’ N'Pl’ 'Pk ’ 





and this in turn by linear transformation can be expressed as, 

( 9 ) ^ ^ ^ -2 a 7 - r 

^ ^ when 

/ 



To complete the proof we proceed as in Part I. 



ON CERTAIN RELATIONSHIPS BETWEEN $\beta_1$ AND $\beta_2$
FOR THE POINT BINOMIAL*

By Margaret Merrell

1. Introduction.

The extensive literature on the point binomial covers studies on a variety of its properties, apropos of its use as a discrete probability function and of its approximate representation by certain continuous curves. Investigations on such properties as the sum of its terms within specified limits, the ratio of its ordinates, and the slope of chords connecting successive ordinates have thrown light on the characteristics of the point binomial and have suggested various continuous functions as substitutions for the binomial expansion.

Prominent among such studies have been those dealing with the moments of the binomial and it is with these properties that the present paper is concerned. The first four moments have been used by Pearson¹ as a means of fitting a point binomial to observed data and he has pointed out² that these moments expressed in terms of $\beta_1$ and $\beta_2$ approach the corresponding moments of the normal curve as $n$ becomes indefinitely large. Other papers that especially concern the following discussion are one by Student³ in which he discussed the relationship between $\beta_1$ and $\beta_2$ for the point binomial and the Poisson exponential series, and one by Lucy Whitaker,⁴ in which the range of $\beta_1$ and $\beta_2$ and certain relationships between the moments and the constants of the point binomial were discussed.

*Paper No. 175 from the Department of Biostatistics, School of Hygiene and Public Health, The Johns Hopkins University, Baltimore, Md.

¹Pearson, K. Skew variation in homogeneous material. Phil. Trans. (1895), pp. 343-414.

²Pearson, K. On the curves which are most suitable for describing the frequency of random samples of a population. Biometrika, Vol. 5 (1906), pp. 172-175.

³Student. On the error of counting with a haemacytometer. Biometrika, Vol. 5 (1906), pp. 351-360.

⁴Whitaker, Lucy. On the Poisson law of small numbers. Biometrika, Vol. 10 (1914), pp. 36-71.

The present note will give some additional relationships between the third and fourth moments of the point binomial, in terms of $\beta_1$ and $\beta_2$, and will discuss these relationships in connection with their bearing on the association of the point binomial and the normal curve. The point binomial, $(p+q)^n$, has of course two constants $p$ and $n$ which completely determine its characteristics. Certain of these properties are closely connected with $\beta_1$ and $\beta_2$, and it is therefore of interest to see how $\beta_1$ and $\beta_2$ change as $p$ and $n$ take on different values. In order to see the effect of varying each of the constants, the relationship between $\beta_1$ and $\beta_2$ will be determined for varying values of $n$ when $p$ is held constant, and for varying values of $p$ when $n$ is held constant. In addition to these relationships, it is possible to see how the $\beta$'s are related when both $p$ and $n$ are allowed to vary while certain functions of these parameters are held constant. In the following discussion the relationship between $\beta_1$ and $\beta_2$ will be considered for the cases where the mean, $np$, is held constant, and where the square of the standard deviation, $npq$, is held constant, $n$ and $p$ being variable.

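The moments of the point binomial $(p+q)^n$ are:

$$\text{mean} = np, \qquad \mu_2 = npq, \qquad \mu_3 = npq(q-p), \qquad \mu_4 = 3n^2p^2q^2 + npq(1-6pq).$$

These moments lead to the following values of the $\beta$'s:

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(q-p)^2}{npq}, \tag{1}$$

$$\beta_2 = \frac{\mu_4}{\mu_2^2} = 3 + \frac{1-6pq}{npq}. \tag{2}$$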
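Formulas (1) and (2) can be compared with $\beta_1$ and $\beta_2$ computed directly from the terms of the expansion. The following is a minimal sketch of that comparison; the function names and the particular values of $n$ and $p$ are illustrative assumptions.

```python
from math import comb


def betas_from_terms(n: int, p: float):
    """beta_1 and beta_2 computed from the individual terms of (p + q)^n."""
    q = 1.0 - p
    terms = [comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)]
    mean = sum(k * t for k, t in enumerate(terms))

    def central(r: int) -> float:
        # r-th moment about the mean
        return sum((k - mean) ** r * t for k, t in enumerate(terms))

    mu2, mu3, mu4 = central(2), central(3), central(4)
    return mu3**2 / mu2**3, mu4 / mu2**2


def betas_closed_form(n: int, p: float):
    """beta_1 and beta_2 from formulas (1) and (2)."""
    q = 1.0 - p
    npq = n * p * q
    return (q - p) ** 2 / npq, 3.0 + (1.0 - 6.0 * p * q) / npq


for n, p in [(12, 0.2), (30, 0.5), (8, 0.9)]:
    print(n, p, betas_from_terms(n, p), betas_closed_form(n, p))
```

For each pair of constants the two methods agree to rounding error, and the symmetrical case $p = .5$ gives $\beta_1 = 0$.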





Although the point binomial is ordinarily applied only to the case where $p$, $q$ and $n$ are positive, it should be noted that these formulae for $\beta_1$ and $\beta_2$ are not limited to this case. The only limitation on these constants is that $p + q = 1$.

2. The relationship between $\beta_1$ and $\beta_2$ for constant values of $p$.

If we eliminate $n$ between (1) and (2) we obtain an equation relating $\beta_1$, $\beta_2$ and $p$, $n$ being unspecified. This equation is

$$\beta_2 - 3 = \frac{1-6pq}{(q-p)^2}\,\beta_1. \tag{3}$$

FIG. 1. THE RELATIONSHIP BETWEEN $\beta_1$ AND $\beta_2$ FOR POINT BINOMIALS HAVING CONSTANT VALUES OF $p$ AND CONSTANT VALUES OF $n$




MARGARET MERRELL 


219 


or fixed values oi/o , this represents a family of straight lines 

ith slope f , all passing through the position 

e position of the/^?>jfor the normal curve. Figure 1 shows a 
‘oup of these lines on the plane^ for the values of /d indi- 

ted. The various positions on these lines represent and/<^ 
ir point binomials having the specified p's and varying t) 's . Only 
ose values oi p which are between O and J are included in this 
agram, and these only for positive values of 77 , since these values 
>ver all the ordinary probability problems. From (1) it can be 
en that for such values of p and r? , is positive The /^s for 
nomials having parameters outside these limits will be discussed 
a later paragraph. 

The range through which these lines can swing can be determined by substituting the limiting values of $p$ in equation (3). The values of $p$ to be used in determining this range are 0 and .5, not 0 and 1. This follows from the fact that, since $p$ and $q$ are interchangeable in the point binomial, and the slope of the lines given by (3) is symmetrical in $p$ and $q$, the lines obtained for $p$ between .5 and 1 would be identical with the lines for the complement of $p$, $1 - p$, between 0 and .5. Thus any particular line represents point binomials for two complementary values of $p$.

If we substitute the value .5 for $p$ in (3), the equation becomes

(4)  $$\beta_1 = 0,$$

as we would expect, since point binomials having $p = q = .5$ are symmetrical. If $p = 0$, equation (3) becomes

(5)  $$\beta_2 - \beta_1 - 3 = 0.$$

The latter line is identical with the line giving the relationship between $\beta_1$ and $\beta_2$ in the Poisson exponential series. This is in harmony with the derivation of the Poisson exponential as the limiting case of the point binomial as $p$ tends to 0, and $n$ tends to $\infty$, $np$ being finite. If we denote the mean of this series by $m$, the moments are

$$\mu_2 = m, \qquad \mu_3 = m, \qquad \mu_4 = 3m^2 + m,$$

and

$$\beta_1 = \frac{1}{m}, \qquad \beta_2 = 3 + \frac{1}{m}.$$

We have thus the equation relating $\beta_1$ and $\beta_2$ as given by (5).
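The Poisson values $\beta_1 = 1/m$ and $\beta_2 = 3 + 1/m$, and hence the line $\beta_2 - \beta_1 - 3 = 0$, can be confirmed numerically. The following is only a sketch: the helper name `poisson_betas` and the truncation at 200 terms are assumptions, adequate while $m$ is small compared with the number of terms.

```python
import math

def poisson_betas(m, terms=200):
    """Numerically estimate (beta1, beta2) of a Poisson series with mean m."""
    # Build the probabilities by recurrence so no large factorial is needed.
    probs = [math.exp(-m)]
    for k in range(1, terms):
        probs.append(probs[-1] * m / k)

    def central(r):
        # r-th moment about the mean m, summed over the truncated series.
        return sum(pk * (k - m) ** r for k, pk in enumerate(probs))

    beta1 = central(3) ** 2 / central(2) ** 3
    beta2 = central(4) / central(2) ** 2
    return beta1, beta2

b1, b2 = poisson_betas(2.5)
print(b1, b2, b2 - b1 - 3)   # close to 1/m, 3 + 1/m and zero respectively
```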

The radiating lines giving the $\beta$s of point binomials having values of $p$ between 0 and 1 will therefore lie in the range between the vertical, $\beta_1 = 0$, and the Poisson exponential line. This range, which is indicated in figure 1, was pointed out by Lucy Whitaker in the paper mentioned above. The Type III line is included in this graph to indicate that portion of the $\beta_1$, $\beta_2$ plane covered by this family of lines. It is of interest to note that $\beta_1$ and $\beta_2$ for skew binomials do not approach the $\beta$s of the Type III curve, although Pearson has shown that in an important slope property the skew binomial polygon and the Type III curve follow the same law.

3. The relationship between $\beta_1$ and $\beta_2$ for constant values of $n$.

The equations for constant values of $p$ have been expressed as continuous straight lines, but only certain positions on these lines pertain to binomials having integral values of $n$. These points are determined by the intersection of these lines with the curve relating $\beta_1$, $\beta_2$ and $n$, when $n$ is held constant at integral

^Student. Loc cit. p. 353. 

^Whitaker, Lucy. Loc, cit p, 37 

^Pearson, K. Skew variation in homogeneous material. Loc. cit. p. 357. 



values. The equation of this curve given by eliminating $p$ and $q$ between (1) and (2) is:

(6)  $$\beta_2 - \beta_1 - 3 + \frac{2}{n} = 0.$$

This is a family of straight lines parallel to the Poisson exponential line, a specified value of $n$ determining a particular line. The intersection of any of these lines with any $p$ line determines the $\beta$s of the point binomial of specified $p$ and $n$. Figure 1
gives the graph of three such lines for and 

From this graph it can be seen that even with an $n$ as small as
$\beta_1$ and $\beta_2$ for the symmetrical and slightly skew binomials are not far from the position of the $\beta$s of the normal curve, but for the highly skew binomials, they are quite far from this position. This is in agreement with the fact that the more skew the binomial, the larger the $n$ required to make the normal curve a good substitute for the binomial expansion.

It is evident from this graph that as $n$ is fixed at increasingly large values, $\beta_1$ and $\beta_2$ for the point binomials of different $p$'s converge quite rapidly toward the normal position. The limit of equation (6), as $n$ becomes indefinitely large, is equation (5), that is, the line giving $\beta_1$ and $\beta_2$ for the Poisson exponential. This is to be expected, considering the conditions under which the point binomial approaches the Poisson exponential. For this limiting case the $\beta_1$, $\beta_2$ line for constant $n$ crosses all the radiating $p$ lines at $(0, 3)$, except the line for $p = 0$, with which it coincides throughout. Thus for all values of $p$, except 0 and 1, $\beta_1$ and $\beta_2$ for the point binomial agree with the corresponding moments of the normal curve, as $n$ becomes indefinitely great.
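The constant-$n$ relation can be verified in the same exact fashion. The sketch assumes that eliminating $p$ and $q$ gives $\beta_2 - \beta_1 - 3 + 2/n = 0$, so that every binomial of given $n$ sits at the fixed offset $-2/n$ from the Poisson exponential line; `poisson_line_offset` is an illustrative name.

```python
from fractions import Fraction

def betas(n, p):
    """(beta1, beta2) of the point binomial (q + p)^n from the standard formulae."""
    p = Fraction(p)
    q = 1 - p
    npq = n * p * q
    return (q - p) ** 2 / npq, 3 + (1 - 6 * p * q) / npq

def poisson_line_offset(n, p):
    """Return beta2 - beta1 - 3, which is zero on the Poisson exponential line."""
    b1, b2 = betas(n, p)
    return b2 - b1 - 3

# For a fixed n the offset is the same for every p, namely -2/n,
# and it shrinks toward zero as n is made larger.
for n in (4, 25, 400):
    for p in (Fraction(1, 10), Fraction(1, 3), Fraction(1, 2), Fraction(9, 10)):
        assert poisson_line_offset(n, p) == Fraction(-2, n)
```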

4. The relationship between $\beta_1$ and $\beta_2$ for constant values of $np$.

In judging how adequate the size of a particular sample is, for a specified value of $p$, we frequently make our estimate in terms, not of $n$, but of the mean value, $np$. This is with the thought that we are in approximately as good a position with a $p$ of .1 and an $n$ of 50, for example, as with a $p$ of .01 and an $n$ of 500, since the expected number is the same in both cases. For instance, our knowledge of a penny from 10 tosses is about as complete as that of a die from 30 tosses. To study this question from the moments of the binomial we can determine the curve relating the $\beta$s for binomials of constant $np$, by replacing $n$ by $\frac{m}{p}$ in equations (1) and (2), and eliminating the remaining $p$ and $q$ between these two equations. This gives the equation

(7)  $$\left(2\beta_2 - 3\beta_1 - 6\right)\left(\beta_2 - \beta_1 - 3 + \frac{2}{m}\right) + \frac{2}{m^2} = 0.$$

This is the equation of a hyperbola with asymptotes

(8)  $$2\beta_2 - 3\beta_1 - 6 = 0,$$

(9)  $$\beta_2 - \beta_1 - 3 + \frac{2}{m} = 0.$$

The substitution of any particular value of $m$ in equation (7) will give the curve of $\beta$s for all binomials having this specified mean value. Its intersection with the radiating lines of constant $p$ gives $\beta_1$ and $\beta_2$ for the point binomial having the specified $p$ and mean value. Figure 2 shows the two hyperbolas for which $m = 2$ and 10 respectively.

FIG. 2. THE RELATIONSHIP BETWEEN $\beta_1$ AND $\beta_2$ FOR POINT BINOMIALS HAVING CONSTANT VALUES OF $np$

Turning to the asymptotes of the hyperbola, it will be seen that only one of them varies with $m$. Thus the various hyperbolas obtained by substituting different values of $m$ in (7) will all be asymptotic to the same line (8). The centers of all the hyperbolas will therefore lie on this line, which, it will be noted, is the Type III line. The other asymptote is parallel to the Poisson exponential line and a comparison of its equation, (9), with equation (6) shows that it represents the same relationship between $\beta_1$, $\beta_2$, and $m$, as that previously derived between $\beta_1$, $\beta_2$ and $n$. This asymptote is therefore the particular line in the family of $n$ lines for which $n = m$ or $p = 1$. Since any point on the hyperbola is the intersection of $n$ and $p$ lines such that $np$ has the constant value $m$, it follows that points farther and farther out on the hyperbola represent the intersection of lines having values of $n$ closer and closer to the mean, $np$, and values of $p$ approaching the value 1. If we consider this asymptote for hyperbolas of increasing values of $m$ we see that it approaches the Poisson exponential line, and in the limiting case, as $m$ becomes indefinitely large, the hyperbola degenerates into the two lines which represent the $\beta$s for the Poisson exponential series, and the Type III curve. These limits will be further discussed in a later paragraph.
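The hyperbola for constant mean can be checked numerically. The sketch below uses one algebraically convenient arrangement obtained by eliminating $p$ and $q$ with $n = m/p$, namely $(2\beta_2 - 3\beta_1 - 6)(\beta_2 - \beta_1 - 3 + 2/m) + 2/m^2 = 0$; this arrangement is an assumption and may differ in form from the printed equation (7), though it displays the two asymptotes directly.

```python
from fractions import Fraction

def betas(n, p):
    """(beta1, beta2) of the point binomial (q + p)^n from the standard formulae."""
    p = Fraction(p)
    q = 1 - p
    npq = n * p * q
    return (q - p) ** 2 / npq, 3 + (1 - 6 * p * q) / npq

def on_constant_mean_hyperbola(b1, b2, m):
    """True when (b1, b2) satisfies the assumed constant-mean relation for mean m."""
    type_iii = 2 * b2 - 3 * b1 - 6               # zero on the Type III line
    parallel = b2 - b1 - 3 + Fraction(2, m)      # zero on the line for n = m
    return type_iii * parallel + Fraction(2, m * m) == 0

m = 2
for n in (3, 4, 10, 12, 50):
    # Every binomial with n * p = m lies on the same hyperbola.
    assert on_constant_mean_hyperbola(*betas(n, Fraction(m, n)), m)
```

With $n = 4$ and $p = .5$ the point falls on $\beta_1 = 0$, which is where the hyperbola for $m = 2$ touches the vertical.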




All of the hyperbolas in the family represented by equation (7) are tangent to the line $\beta_1 = 0$, which is the line for binomials of $p = .5$. From figure 2 it can be seen that the lines giving the $\beta$s for the other values of $p$ are crossed twice by each hyperbola. This is due to the fact that each line represents the $\beta$s for two complementary values of $p$, and for such values of $p$ the same mean value would result from two different values of $n$, such that

$$m = np = n'q.$$

For example, the binomials /^^ /• (^y**^*^ and have /o 

values that lie on the same straight line and mean values that lie on the same hyperbola. It is thus obvious why the hyperbola must be tangent to the line for $p = .5$, since in this case the two complementary values of $p$ are equal and there can be only one value of $n$ which will produce a given $m$.

From the discussion of lines relating $\beta_1$ and $\beta_2$ for constant $n$ values, it is evident that of the two crossings of any line, the one nearer the Gaussian position is for the point binomial with the larger $n$ and therefore for the smaller of the two complementary $p$ values. Furthermore, through the point of intersection of two hyperbolas, only one $n$ line and one $p$ line will pass, and the $\beta$s thus determined are for two binomials, one having the smaller mean and the smaller $p$, and the other the larger mean and $p$, both having the same $n$. For example, the two hyperbolas given in figure 2 intersect at the point $\beta_1 = .2666$, $\beta_2 = 3.1$. The value of $n$ for this position is 12, and for $p$ is $1/6$ or $5/6$. These values of $\beta_1$ and $\beta_2$ are therefore for the binomials $(5/6 + 1/6)^{12}$ with the mean value 2 and $(1/6 + 5/6)^{12}$ with the mean value 10.
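The crossing of two constant-mean hyperbolas can be worked out directly: if the two binomials share $n$ and have complementary $p$, then $np = m_1$ and $n(1-p) = m_2$, so $n = m_1 + m_2$. The sketch below applies this to the means 2 and 10 of figure 2; the function name is illustrative, and the standard formulae for $\beta_1$ and $\beta_2$ are assumed.

```python
from fractions import Fraction

def betas(n, p):
    """(beta1, beta2) of the point binomial (q + p)^n from the standard formulae."""
    p = Fraction(p)
    q = 1 - p
    npq = n * p * q
    return (q - p) ** 2 / npq, 3 + (1 - 6 * p * q) / npq

def crossing_of_mean_hyperbolas(m1, m2):
    """Return (n, p) for the pair of binomials sharing the crossing point."""
    n = m1 + m2              # n*p = m1 and n*(1 - p) = m2 add up to n
    p = Fraction(m1, n)
    return n, p

n, p = crossing_of_mean_hyperbolas(2, 10)
b1, b2 = betas(n, p)
print(n, p, b1, b2)          # prints: 12 1/6 4/15 31/10
assert betas(n, 1 - p) == (b1, b2)   # the complementary binomial shares the point
```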

From figure 2, it is seen that the hyperbolas extend into the area between the Poisson exponential line and the Type III line. Since, as stated above, all $\beta$s for $p$ between 0 and 1 fall in the area between $\beta_1 = 0$ and the Poisson exponential line, it raises the question as to the meaning of the hyperbola outside that area. In the point binomial $(q + p)^n$, there is nothing to force $p$ to be between 0 and 1, and $n$ to be positive except the conditions we impose for applications to probability problems. If we consider the general case, without these limitations, an analysis of equation (3) shows that the radiating lines giving $\beta_1$ and $\beta_2$ for fixed values of $p$ continue into the area between the Poisson exponential and the Type III lines, the values of $p$ in this area being either negative, or the complements of these negative values, that is, positive values greater than 1. This area includes all values of $p$ from 0 to $-\infty$, and from 1 to $+\infty$. Thus, lines giving $\beta_1$ and $\beta_2$ for all real values of $p$ from $-\infty$ to $+\infty$ are included between the vertical and the Type III lines. Outside this area, the values of $p$ are imaginary. Turning to the values of $n$, we can see from equation (6) that in the family of parallel lines giving $\beta_1$ and $\beta_2$ for fixed values of $n$ the lines below the Poisson exponential all have negative values of $n$. As $n$ approaches $-\infty$, this line approaches the Poisson exponential line from below. Figure 3 shows the subdivisions of the $\beta_1$, $\beta_2$ plane for the various cases.

FIG. 3. SUBDIVISION OF THE $\beta_1$, $\beta_2$ PLANE FOR BINOMIALS CLASSIFIED ACCORDING TO THE VALUES OF $p$ AND $n$

It follows from this discussion that the values of the hyperbola in the area between the Poisson exponential and Type III lines give $\beta$s for point binomials with negative $n$ and, since $np$ is a fixed positive value for any hyperbola, negative values of $p$, ranging from 0 to $-\infty$. These results are in harmony with the case mentioned above, where the hyperbola degenerates into two straight lines (the Poisson exponential line and the Type III line) as the mean becomes indefinitely large, for $np$ approaches $\infty$ when either $n$ or $p$ approaches $\infty$. Thus, the Poisson exponential line is the limit when $n$ approaches $\infty$, and $p = 1$, and the Type III line when $p$ approaches $-\infty$, and $n$ is negative.

5. The relationship between $\beta_1$ and $\beta_2$ for constant values of $\sigma$.

A further point of interest is the scatter of the $\beta$s for point binomials of varying $p$'s but constant standard deviations. In equations (1) and (2), if we let $npq = \sigma^2$ and eliminate $p$ and $q$ we have the equation

(10)  $$2\beta_2 - 3\beta_1 - \frac{6\sigma^2 - 1}{\sigma^2} = 0.$$

This, like the lines giving $\beta_1$ and $\beta_2$ for constant values of $n$, is a family of parallel straight lines, but where the $n$ lines were parallel to the Poisson exponential line, this group is parallel to the Type III line. These lines intersect the radiating lines giving the $\beta$s for constant values of $p$, and as $\sigma$ increases, the points of intersection of these lines approach the $\beta_1$ and $\beta_2$ for the normal curve. As $\sigma$ approaches $\infty$, the line given by (10) approaches the Type III line, and in the limiting case, crosses all the $p$ lines at the Gaussian position.
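The constant-variance relation can be confirmed exactly as well. The sketch assumes the form $2\beta_2 - 3\beta_1 - 6 + 1/\sigma^2 = 0$ with $\sigma^2 = npq$, so the offset from the Type III line depends on the variance alone; `type_iii_offset` is an illustrative name.

```python
from fractions import Fraction

def betas(n, p):
    """(beta1, beta2) of the point binomial (q + p)^n from the standard formulae."""
    p = Fraction(p)
    q = 1 - p
    npq = n * p * q
    return (q - p) ** 2 / npq, 3 + (1 - 6 * p * q) / npq

def type_iii_offset(n, p):
    """Return 2*beta2 - 3*beta1 - 6, which vanishes on the Type III line."""
    b1, b2 = betas(n, p)
    return 2 * b2 - 3 * b1 - 6

for n, p in ((8, Fraction(1, 4)), (20, Fraction(1, 10)), (50, Fraction(2, 5))):
    variance = n * p * (1 - p)
    # The offset is -1/sigma^2, whatever p and n produced that variance.
    assert type_iii_offset(n, p) == -1 / variance
```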



6. Summary.

Certain relationships between the third and fourth moments of the point binomial in terms of $\beta_1$ and $\beta_2$ have been discussed and the following results have been brought out:

A. For fixed values of $p$, $\beta_1$ and $\beta_2$ are linearly related, forming a family of radiating lines, all passing through the position of the $\beta$s for the normal curve. Each of the lines represents the $\beta$s for point binomials having a fixed value $p$ or its complement, $1 - p$. The lines for values of $p$ between 0 and 1 are included between the vertical, $\beta_1 = 0$, which is the line for $p = .5$, and the Poisson exponential line, $\beta_2 - \beta_1 - 3 = 0$, which is the line for $p = 0$ or 1. The lines for negative values of $p$ or positive values greater than 1 fall between the Poisson exponential line and the Type III line, $2\beta_2 - 3\beta_1 - 6 = 0$. For the rest of the plane the values of $p$ are imaginary.

B. Although it has been shown by Pearson that in certain slope properties the skew point binomial resembles the Type III curve, none of the binomials which we interpret as probability functions, that is, those having $p$ between 0 and 1 and $n$ positive, has $\beta_1$ and $\beta_2$ approaching those of the Type III curve, except for the special case where the Type III curve becomes identical with the normal curve.

C. For fixed values of $n$, $\beta_1$ and $\beta_2$ determine a series of straight lines parallel to the Poisson exponential line. For positive values of $n$, these lines are above the Poisson exponential line, and for negative values below this line. Intersections of these lines with the radiating $p$ lines determine $\beta_1$ and $\beta_2$ for the point binomial of specified $p$ and $n$. As $n$ is held constant at increasingly great values, the points of intersection are closer and closer to the position of the $\beta$s for the normal curve, and in the limiting case the line of $\beta$s for constant $n$ intersects at the normal position all of the family of $p$ lines except the line for $p = 0$, with which it coincides.

D. $\beta_1$ and $\beta_2$ for point binomials of varying $p$ and $n$, but constant mean values, lie on a family of hyperbolas, a particular hyperbola being determined by a specified mean. One of the asymptotes of all these hyperbolas is the Type III line, and the other is a line parallel to the Poisson exponential line, at a distance from it depending on the value of the mean. The limit of this asymptote as the mean approaches $\infty$ is the Poisson exponential line. These hyperbolas are tangent to the line $\beta_1 = 0$ (the line for $p = .5$) and cross the other $p$ lines twice, the intersection of the hyperbola and any $p$ line or any $n$ line determining the $\beta$s for the point binomial whose $p$ and $n$ are defined by the intersection.

E. For varying $p$ and $n$ but fixed $npq$, $\beta_1$ and $\beta_2$ lie on a family of straight lines parallel to the Type III line, one line in the group being determined by a particular $npq$. The limit of these lines as $npq$ is held constant at increasingly large values is the Type III line.





EDITORIAL 


NOTE ON THE COMPUTATION AND MODIFICA- 
TION OF MOMENTS 

For the purpose of this note we shall deviate from the usual practice in the calculus of finite differences and define

$$\Delta u_x = u_{x+1} - M u_x,$$

where $M$ is a constant. It follows that this generalized $\Delta$ and the symbol $E$ are connected by the operator relation $\Delta = (E - M)$, so that $\Delta^n = (E - M)^n$ and therefore

(1)  $$\Delta^n u_x = u_{x+n} - \binom{n}{1} M u_{x+n-1} + \binom{n}{2} M^2 u_{x+n-2} - \cdots + (-1)^n M^n u_x .$$

If the $n$-th unmodified moments about an arbitrary origin, and about the arithmetic mean, be designated by $\nu_n$ and $\mu_n$, respectively, the usual relation may be written

(2)  $$\mu_n = \nu_n - \binom{n}{1} M \nu_{n-1} + \binom{n}{2} M^2 \nu_{n-2} - \cdots + (-1)^n M^n \nu_0,$$

where $M = \nu_1$ equals the distance of the mean from the provisional mean. From (1) and (2) it follows that

(3)  $$\mu_n = \Delta^n \nu_0,$$

that is, the moment about the mean is simply equal to the $n$-th leading difference of $\nu_0$. Since computing machines are ideally adapted to computations of the type $(A - B \cdot C)$, formula (3) is very effective.
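The leading-difference scheme is short to program. The sketch below assumes the generalized difference $\Delta u_x = u_{x+1} - M u_x$ with $M = \nu_1$, and takes the power sums of the weight distribution as listed in Table 1; the name `leading_differences` is illustrative.

```python
def leading_differences(nu, M):
    """Return the leading differences [nu_0, D nu_0, D^2 nu_0, ...],
    where the generalized difference is D u_x = u_(x+1) - M * u_x."""
    column = list(nu)
    leading = [column[0]]
    while len(column) > 1:
        # Each new column is one entry shorter than the one before it.
        column = [column[i + 1] - M * column[i] for i in range(len(column) - 1)]
        leading.append(column[0])
    return leading

# Power sums of x^n * f for the 7749 weights, as listed in Table 1.
sums = [7749, 1726, 35670, 77344, 766026, 4200496, 38164290]
nu = [s / sums[0] for s in sums]          # moments about the provisional mean
mu = leading_differences(nu, M=nu[1])     # moments about the arithmetic mean
for n, value in enumerate(mu):
    print(n, value)
```

Each step is a single multiply-and-subtract of the type $(A - B \cdot C)$, which is what makes the arrangement convenient on a desk machine.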

As an illustration, let us compute the first seven moments about the mean for the distribution of weights of 7749 adult males born in the British Isles. (See Yule's "Theory of Statistics", p. 95.) The provisional mean is taken in the 150-lb. class, and the class interval of ten pounds is taken as the unit of $x$.


Table 1

n   Σxⁿf        νₙ          Δνₙ        Δ²νₙ       Δ³νₙ       Δ⁴νₙ       Δ⁵νₙ       Δ⁶νₙ
0       7749    1.000000    .000000    4.55356    6.92736    91.3249    436.420    4272.15
1       1726     .222738    4.55356    7.94161    92.8679    456.762    4369.36
2      35670    4.60317     8.95586    94.6368    477.447    4471.10
3      77344    9.98116     96.6316    498.526    4577.45
4     766026    98.8548     520.050    4688.49
5    4200496    542.069     4804.32
6   38164290    4925.06



It is very important that the provisional mean be so chosen that $M = \nu_1$ is less than unity; otherwise particular attention must be paid to the number of digits which are significant in the values of the various differences.

Let us now discuss the modification of moments. 

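Designating the general modified or corrected moment by $\bar{\mu}_n$, Sheppard's formula for continuous variates may be written

(4)  $$\bar{\mu}_n = \nu_n - \frac{1}{12}\binom{n}{2}\nu_{n-2} + \frac{7}{240}\binom{n}{4}\nu_{n-4} - \frac{31}{1344}\binom{n}{6}\nu_{n-6} + \cdots$$

so that for moments about the mean,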



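(5)  $$\begin{aligned}
\bar{\mu}_2 &= \mu_2 - \tfrac{1}{12} \\
\bar{\mu}_3 &= \mu_3 \\
\bar{\mu}_4 &= \mu_4 - \tfrac{1}{2}\mu_2 + \tfrac{7}{240} \\
\bar{\mu}_5 &= \mu_5 - \tfrac{5}{6}\mu_3 \\
\bar{\mu}_6 &= \mu_6 - \tfrac{5}{4}\mu_4 + \tfrac{7}{16}\mu_2 - \tfrac{31}{1344}
\end{aligned}$$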


In many distributions discrete variates are grouped into classes, 
each class containing k different values of the variable. The for- 
mula, corresponding to Sheppard's, for grouped-discrete distribu- 
tions may be obtained by employing the calculus of finite differ- 
ences, and was given without proof in an Editorial on page 111, 
Vol. 1, No. 1 of the Annals as follows, 






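(6)  $$\bar{\mu}_n = \nu_n - \frac{1}{12}\binom{n}{2}\left(1 - \frac{1}{k^2}\right)\nu_{n-2} + \frac{1}{240}\binom{n}{4}\left(1 - \frac{1}{k^2}\right)\left(7 - \frac{3}{k^2}\right)\nu_{n-4} - \frac{1}{1344}\binom{n}{6}\left(1 - \frac{1}{k^2}\right)\left(31 - \frac{18}{k^2} + \frac{3}{k^4}\right)\nu_{n-6} + \cdots$$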


Obviously the limit of (6) as k approaches infinity is (4). So 




In both (5) and (7) above it is to be understood that the class interval, $\Delta$, is chosen as the unit of $x$.

The modified moments, $\bar{\mu}_n$, of the following table are obtained by applying formulae (5) to the values of $\mu_n$, which were obtained as the leading differences of table 1.


Table 2

n    μₙ (unmodified)   μₙ (modified)   σⁿ          αₙ
0    1.00000           1.00000         1.00000     1.000000
1     .00000            .00000         2.114292     .000000
2    4.55356           4.47023         4.47023     1.000000
3    6.92736           6.92736         9.45137      .732948
4    91.3249           89.0773         19.9830     4.45765
5    436.420           430.647         42.2499     10.1929
6    4272.15           4159.96         89.3286     46.5692
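The modified column of Table 2 can be reproduced from the unmodified one. The sketch below assumes the usual Sheppard series for continuous variates, with coefficients $-\frac{1}{12}$, $\frac{7}{240}$ and $-\frac{31}{1344}$ multiplying $\binom{n}{2}$, $\binom{n}{4}$ and $\binom{n}{6}$ and the class interval as unit; the names `SHEPPARD` and `modified_moment` are illustrative.

```python
from math import comb

# Series coefficients for the continuous case, class interval taken as unit:
# offsets 0, 2, 4, 6 carry 1, -1/12, 7/240, -31/1344.
SHEPPARD = {0: 1.0, 2: -1 / 12, 4: 7 / 240, 6: -31 / 1344}

def modified_moment(mu, n):
    """Apply the series to unmodified moments mu[0..n] about the mean (n <= 7)."""
    return sum(c * comb(n, j) * mu[n - j] for j, c in SHEPPARD.items() if j <= n)

# Unmodified moments about the mean, read from the leading differences of Table 1.
mu = [1.0, 0.0, 4.55356, 6.92736, 91.3249, 436.420, 4272.15]
for n in range(7):
    print(n, round(modified_moment(mu, n), 4))
```

The series is cut off after the sixth-order term here, so the function should not be used beyond $n = 7$.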


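Tables 3 and 4 shed an interesting light on the subject of modification, and are obtained from the results of formulae (5) and (7). If the class interval be denoted by $\Delta$, and the unmodified values of the standard deviation and skewness by $\sigma_\nu$ and $\alpha_{3:\nu}$, respectively, that is

(8)  $$\sigma_\nu = \Delta\sqrt{\mu_2}, \qquad \alpha_{3:\nu} = \frac{\mu_3}{\mu_2^{3/2}},$$

it follows that

(9)  $$\sigma = \omega\,\sigma_\nu \qquad \text{and} \qquad \alpha_3 = \omega^{-3}\,\alpha_{3:\nu},$$

where

$$\omega = \sqrt{1 - \frac{1}{12}\left(1 - \frac{1}{k^2}\right)\left(\frac{\Delta}{\sigma_\nu}\right)^2}.$$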





As mentioned before, the case of continuous variates is the special case for which $k = \infty$.

To illustrate, for our distribution taken from Yule, $\Delta = 10$ lbs., and by (8)

$$\sigma_\nu = 10\sqrt{4.55356} = 21.3391 \text{ lbs.}$$

$$\alpha_{3:\nu} = \frac{6.92736}{(4.55356)^{3/2}} = .712919$$

$$\frac{\Delta}{\sigma_\nu} = .468623$$

From table 3, $\frac{\Delta}{\sigma_\nu} = .47$, $k = \infty$, we have $\omega = .9907$ and consequently $\sigma = .9907\,(21.3391 \text{ lbs.}) = 21.14$, agreeing with the more exact value deduced from table 2, i.e. 21.14292 lbs.

From table 4, $\frac{\Delta}{\sigma_\nu} = .47$, $k = \infty$, we have $\omega^{-3} = 1.028$ and therefore $\alpha_3 = 1.028\,(.712919) = .7329$, again agreeing with the more accurate value of table 2.

By either interpolation of tables 3 and 4, or by direct computation of $\omega$ and $\omega^{-3}$, greater accuracy may be obtained.

For an illustration of grouped-discrete variates we may refer to pages 32 and 37 of Vol. 1, No. 1 of the Annals. For the so-called D(4.1) of Table IX, $\Delta = 4$, $\sigma_\nu = 5.089$, $\alpha_{3:\nu} = .096$. Hence $\frac{\Delta}{\sigma_\nu} = .79$, and since $k = 4$ the modified or adjusted values are

$$\sigma = .9753\,(5.089) = 4.963$$

$$\alpha_3 = .096\,(1.078) = .103.$$


It should be observed that the factors $\omega$ and $\omega^{-3}$ are independent of the number of variates, $N$, and are just as properly applied even if the frequency distribution method for computing the standard deviation, skewness, etc. is not employed. Thus, if I compute the standard deviation for the weights of ten individuals and those variates are recorded to the nearest pound, then the resulting

^ V A' 

theoretically, if not practically, should be modified. If it developed that $\sigma_\nu = 20$ lbs., then for weights to the nearest pound $\Delta = 1$, and

$$\omega = .999896.$$

If, on the other hand, the weights had been taken to the nearest half pound, or to the nearest tenth pound, the corresponding values for $\omega$ would be .999974 and .999999 respectively. Only if the variates be discrete and $k = 1$, or if the variates be continuous and be measured with absolute accuracy (which is impossible from a practical point of view) so that $\Delta = 0$, can $\omega = 1$, and modification be ignored.
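The factor $\omega$ is easily computed directly rather than read from a table. The sketch below assumes the form $\omega = \sqrt{1 - \frac{1}{12}(1 - 1/k^2)(\Delta/\sigma_\nu)^2}$, which reproduces the three values quoted above for a standard deviation of 20 lbs.; the function name and the use of `k=None` for the continuous case are illustrative choices.

```python
from math import sqrt

def omega(delta_over_sigma, k=None):
    """Factor by which the unmodified standard deviation is multiplied.

    k is the number of discrete values per class; k=None stands for the
    continuous case (k indefinitely large).
    """
    grouping = 1.0 if k is None else 1.0 - 1.0 / k ** 2
    return sqrt(1.0 - grouping * delta_over_sigma ** 2 / 12.0)

# Weights with an unmodified standard deviation of 20 lbs., recorded to the
# nearest pound, half pound and tenth of a pound.
for delta in (1.0, 0.5, 0.1):
    print(delta, round(omega(delta / 20.0), 6))

# The skewness is multiplied by the inverse cube of the same factor.
print(omega(0.79, k=4) ** -3)
```

With `k=1` the grouping term vanishes and the function returns exactly 1, in line with the remark that modification can then be ignored.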

It should be clearly understood, however, that Sheppard's corrections are merely expectations. We have no assurance that the use of one of these corrections in any single instance will increase the accuracy of that determination; it is quite likely that in any isolated case the modification will introduce a still greater error into the calculation. As pointed out in pages 36-38 of Vol. 1 of the Annals, and clearly revealed in the included table IX, modifying eliminates only the systematic errors, and ignores the accidental errors which may be numerically greater and of opposite algebraic sign than the correction itself.

Lastly, one must remember that the mathematical theory underlying Sheppard's Corrections assumes that both the frequency function and a sufficient number of its derivatives vanish at the limits of the distribution. Consequently the case of J-shaped or U-shaped distributions, or of data not actually classified into such distributions, is not covered by formulae (5) or (7). In any event, it is evident that the practical necessity for modification in all cases depends upon the ratio of the limits of accuracy of the measurements employed in the computations to the unmodified standard deviation. If we throw accurately determined variates into frequency distributions with large class intervals, $\Delta$, and thus simplify certain computations, then we must realize that the introduction of a systematic error is the penalty paid for such procedure, that the greater the value $\frac{\Delta}{\sigma_\nu}$, the greater the penalty, and that the practical necessity for modification rests entirely upon the accuracy demanded of the final results.

The chief value of tables 3 and 4 is that an inspection of these tables gives a rough idea of the value of modification so far as the standard deviation and the skewness are concerned.







Table 4 














SIX DOLLARS PER ANNVM 


NOVEMBER 


1933 


PUBLISHED QUARTERLY BY 
AMERICAN STATISTICAL ASSOCIATION 


Publication Office: Edwards Brothers, Inc., Ann Arbor, Michigan

Office: 530 Commerce Bldg., New York Univ., New York, N. Y.

Entered as second class matter at the Postoffice at Ann Arbor, Mich.,
under the Act of March 3rd, 1879.


STATEMENT OF OWNERSHIP 
' UNDER ACT OF CONGRESS OF AUGUST 24, 1912 

American Statlstlcgl Assodatbn, New York City, New York. 

©Slfilflf*'— H. C, Carver, University oI Michigan, 

l^itar—'Sis.iti.u S. Sekhon, Ann Arbor, Michigan. 

Ri(4te«4f Manager— }, W, Edwards, Ann Arbor, Michigan. 

-0'^«e*v-Alhericaii Statistical Association, 530 Commerce Bldg, New York 
City. 



THE EXTENDED PROBABILITY THEORY FOR THE
CONTINUOUS VARIABLE WITH PARTICULAR
APPLICATION TO THE LINEAR
DISTRIBUTION

By

H. P. Lawther, Jr.

The engineering worker is often confronted with the necessity 
of utilizing a group of quantities concerning whose numerical val- 
ues it is known only that they lie between definite upper and lower 
limits. If a number n of specimens is selected from such a group 
and the sum of the n values taken, intuition rules that there is 
negligible probability that this sum will be as great as n times the 
upper limit or as small as n times the lower limit, and that the 
most probable value must be intermediate between these two ex- 
tremes. Some assurance is desired regarding the practical limits 
within which such a sum may be expected to fall. While the distribution of the individual values within their limits may be unknown directly, yet workable inferences frequently may be made
from the nature of the quantities. For example, in many manu- 
facturing operations it is economical to turn out items (such as 
bearing balls, paper condensers, or spacing washers) in large quan- 
tities with rather coarse precision. By means of gauges set to 
limits narrow as compared with the total spread, the product is 
then selected into bins, and in the operation of assembly a com- 
pleted article utilizes the material from a single bin. The contents 
of any such bin clearly may be expected to follow a linear distri- 
bution very closely, and if the relative proportions of the product 
finding their ways into this bin and its immediate neighbors can 
be learned, the distribution may be specified with practical accuracy. The linear distribution is thus fundamental to a large class of problems.

On several occasions the writer’s speculations have led to pro 





THBORY 

'"'>r Ihr . 

*^-’W ?«, Stid the reference literature 

'‘■f'le fiutT^^k ®P®cial case of the rectangu- 

V ‘^Mmy ^ formulated by I^place’^ 
I^'Vin,® Hall,* and Craig,® in 

W ftif. y®®s applicable to the study of the 

!41b^, „ ^ ®°™ewhat different viewpoint. In 

® processes of specialists in this 

-i-i, forth considerable independent 

"»' y!|? ”* \ ’^t's factory understanding. As the 

T approach was developed. 

Y isu t terminology familiar to one whose 

„ .. limited to that commonly 

« fiir^^Kienng course, and should be readily 

^ i. I -I-* 

need of workers. Encouragement was 
1 generalized linear distribu^ 

t uiw^ilw >ch (.Q ^ new. In the application to 

carry out certain tedious compu- 
values and curves. The results of this 
the thought that they may stimulate the 
0i of a if^vv of considerable application to 

imJllSif that when a selection, or the sum of n 

there is meant the dimension of that selec- 
^*^nsions of the n selections. Following 

des ProhabiHit^s, Troisidme Edition 

l>aw of Probability of Laplace^ Proceedings of 
Congress, Toronto C1924), vol, 2, pp. 795- 

^ /^ijiriViwn’on of Means of Satnpies from a 

with Finite Moments, with special 
BSometrika, vol, 19 (1927), pp, 226-239, 
o/ Means of Samples of Sise N draum from a 
values between 0 and 1, all such values 
vol, 19 (1927), pp, 240-244, 
of i^ertatn Statistics. America 
..« 0932), pp. 353-366 


American Journal 





the usual notation, the symbol $f_n(x)$ will be defined to be such that the integral $\int_a^b f_n(x)\,dx$ is equal to the probability that the sum of $n$ selections lies between the values $a$ and $b$. Neglecting higher orders of infinitesimals, the probability that the sum of $n$ selections lies between $x$ and $x + \Delta x$ would then be equal to the product $f_n(x) \cdot \Delta x$. The sum of $n$ selections clearly is the sum of $n-1$ selections plus the value of an additional selection. The probability that the sum of $n$ selections lies within the interval $x$ to $x + \Delta x$ must then be equal to the summation of the probabilities associated with all possible pairs of values for the sum of the first $n-1$ selections and for the last selection, respectively, that can yield a final sum lying between $x$ and $x + \Delta x$. The values $x - m\,\Delta x$ and $m\,\Delta x$, where $m$ is an integer, are such a pair, and the totality of these pairs is obtained by extending $m$ to all possible values. Recalling that the probability of the simultaneous occurrence of two independent events is equal to the product of the probabilities associated with their individual occurrences, there may be written in the conventional symbols

$$f_n(x) \cdot \Delta x = \sum_{m=-\infty}^{\infty} f_{n-1}(x - m\,\Delta x) \cdot \Delta x \cdot f_1(m\,\Delta x) \cdot \Delta x .$$

Setting $m\,\Delta x = \lambda$, and passing to the limiting form, there is obtained

$$f_n(x) = \int_{-\infty}^{\infty} f_{n-1}(x - \lambda) \cdot f_1(\lambda) \cdot d\lambda$$

as the general formula for determining all subsequent $f_n(x)$ from $f_1(x)$. The form of the function $f_1(x)$ is, of course, determined for any particular case from the best available physical data. The expression for $f_n(x)$ is then obtained by $n-1$ successive applications of the operation of integration indicated above. For the sake of subsequent brevity, the operator $P$ will be defined to be such that

$$P\,\varphi(x) = \int_{-\infty}^{\infty} \varphi(x - \lambda) \cdot f_1(\lambda) \cdot d\lambda .$$

Using this notation it would be written

$$f_n(x) = P^{\,n-1} f_1(x).$$
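The repeated integration can be imitated numerically by replacing the integral with the discrete sum from which it was derived. The sketch below is only an illustration: it assumes one possible form of the linear density, $f_1(x) = \frac{1}{a}\left[1 + K\left(\frac{2x}{a} - 1\right)\right]$ on $0 \le x \le a$, which has unit area and no negative ordinate for $-1 \le K \le 1$ but is not necessarily the author's parameterization, and the function names are hypothetical.

```python
def linear_density(x, a=1.0, K=0.0):
    """Assumed linear density on 0..a (zero elsewhere), with -1 <= K <= 1."""
    if x < 0 or x > a:
        return 0.0
    return (1.0 + K * (2.0 * x / a - 1.0)) / a

def density_of_sum(n, a=1.0, K=0.0, steps=200):
    """Tabulate f_n by n - 1 discrete applications of the operator P."""
    h = a / steps
    # Midpoint samples of f_1 on 0..a.
    f1 = [linear_density((i + 0.5) * h, a, K) for i in range(steps)]
    fn = f1[:]
    for _ in range(n - 1):
        nxt = [0.0] * (len(fn) + steps - 1)
        for i, u in enumerate(fn):
            for j, v in enumerate(f1):
                # One pair (sum of earlier selections, last selection).
                nxt[i + j] += u * v * h
        fn = nxt
    # Entry k of the result belongs to the abscissa (k + n/2) * h.
    xs = [(k + n / 2) * h for k in range(len(fn))]
    return xs, fn, h

xs, fn, h = density_of_sum(3, a=1.0, K=0.5)
area = sum(fn) * h
mean = sum(x * v for x, v in zip(xs, fn)) * h
print(area, mean)   # the area stays at unity; the mean is n times that of f_1
```

For this assumed density the mean of a single selection is $\frac{a}{2} + \frac{Ka}{6}$, which gives a quick check on the tabulated result.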

For the general linear distribution $f_1(x)$ is given as follows:

$f_1(x) = 0$, for $-\infty < x < 0$,

$f_1(x) = 0$, for $a < x < \infty$,

$f_1(x) =$ the equation of a straight line for $0 \le x \le a$, subject to the conditions: (1) the area under the line from $x = 0$ to $x = a$ is unity, (2) no ordinate is negative for any value of $x$ in this interval. By imposing these conditions upon the general equation to a straight line there is obtained


where the parameter $K$ is restricted to the values $-1 \le K \le 1$. With $f_1(x)$ so defined it can be inferred immediately that $f_n(x)$ must be identically zero for all negative values of $x$, will have some positive finite value everywhere in the interval $0 < x < na$, and must be identically zero for all values of $x$ greater than $na$. Also, since $f_1(x)$ is discontinuous for $x = 0$ and for $x = a$ the application of the operator $P$ must be effected through proper choice of limits of integration. In this connection three possible cases arise:

Case 1: Where $x$, the sum of $n$ selections, lies in the interval $0 \le x \le a$ it could have resulted only from a value for the sum of $n-1$ selections lying in the interval $x$ to $0$, coupled with a suitable value for the $n$-th selection lying in the interval $0$ to $x$. For this case the operator $P$ will be distinguished as follows:

$$f_n(x) = P f_{n-1}(x) = \int_0^x f_{n-1}(x - \lambda) \cdot f_1(\lambda) \cdot d\lambda, \quad \text{for } 0 \le x \le a.$$

Case 2: Where $x$, the sum of $n$ selections, lies in the interval $a \le x \le (n-1)a$ it could have resulted only from a value for the sum of $n-1$ selections lying in the interval $x$ to $x - a$, coupled with a suitable value for the $n$-th selection lying in the interval $0$ to $a$. For this case the operator $P$ will be distinguished as follows:

$$f_n(x) = P f_{n-1}(x) = \int_0^a f_{n-1}(x - \lambda) \cdot f_1(\lambda) \cdot d\lambda, \quad \text{for } a \le x \le (n-1)a.$$

Case 3: Where $x$, the sum of $n$ selections, lies in the interval $(n-1)a \le x \le na$ it could have resulted only from a value for the sum of $n-1$ selections lying in the interval $(n-1)a$ to $x - a$, coupled with a suitable value for the $n$-th selection lying in the interval $x - (n-1)a$ to $a$. For this case the operator $P$ will be distinguished as follows:

$$f_n(x) = P f_{n-1}(x) = \int_{x-(n-1)a}^{a} f_{n-1}(x - \lambda) \cdot f_1(\lambda) \cdot d\lambda, \quad \text{for } (n-1)a \le x \le na.$$
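The three cases amount to a single rule: the value $\lambda$ of the $n$-th selection runs from the larger of $0$ and $x - (n-1)a$ to the smaller of $a$ and $x$. A small sketch, with the hypothetical helper name `lambda_limits`, makes the correspondence explicit.

```python
def lambda_limits(x, n, a):
    """Integration limits for the n-th selection, given the sum x of n selections.

    Returns None outside 0..n*a, where the density of the sum is zero.
    """
    if not 0 <= x <= n * a:
        return None
    lower = max(0.0, x - (n - 1) * a)   # keeps the earlier sum at or below (n-1)*a
    upper = min(a, x)                   # keeps the earlier sum at or above zero
    return lower, upper

n, a = 3, 1.0
print(lambda_limits(0.5, n, a))   # Case 1: (0.0, 0.5)
print(lambda_limits(1.5, n, a))   # Case 2: (0.0, 1.0)
print(lambda_limits(2.5, n, a))   # Case 3: (0.5, 1.0)
```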

The procedure now is analogous to that employed in establishing the binomial theorem. The first few $f$'s are obtained by hand-power methods, until the sequences can be discerned and the expression for $f_n(x)$ can be inferred. The expression for $f_n(x)$ is then established, first by applying to $f_n(x)$ the operator $P$ and showing that this yields an expression for $f_{n+1}(x)$ wholly consistent with that for $f_n(x)$ when $n+1$ is substituted for $n$, and finally by showing that it degenerates into $f_1(x)$ when $n$ is taken as 1.

The preliminary steps, while very necessary, are quite tedious, and there would be no value in repeating them here. Suffice it to state that by such means it can be inferred that $f_n(x)$ is of the form




246 


THE extended PROBABILITY THEORY 


where it is understood that each term including a bracket member of the form $[na - ba - x]$ is to be assigned the value zero for values of $x$ which render this negative. The use of brackets $[\ ]$ distinguishes the operand in each term. The symbol $\frac{1}{p}$ denotes the operation of integration with respect to $x$ between the upper limit $x$ and that lower limit for which the integrand vanishes. Thus


i 

P 


[na-ba-xi] 


rr? 


1 

m+1 


[ na - ba - >6j[ ' 


Where $p$ occurs with a negative exponent it signifies the inverse operation of differentiation with respect to $x$. The symbol $\binom{n}{b}$ means $\frac{n!}{b!\,(n-b)!}$ and is one of the familiar binomial coefficients.

Preparatory to establishing the validity of the inferred expression for $f_n(x)$, it is convenient to assemble certain working material. First $\varphi_n(x)$ will be defined as follows:

-("Xl.K*®) (i-K*|^)[na-3a-»J 

where the symbols all have the same meaning as before, but here
there is no special understanding regarding the bracket members
of the form $[na - ba - x]$ and they are to exist for all values of $x$.
Especial note should be made of the inclusion of a final term in
$[-x]$. Otherwise the expression is identical in appearance with
that for $f_n(x)$ for the interval $0 \le x \le a$. Next, the typical
operation





is readily evaluated and yields 

Now it can be written immediately that 


Collecting terms, this becomes 







and this is seen to correspond exactly with the expression for
$\varphi_n(x)$ if $n+1$ is substituted for $n$. Consequently, $\varphi_n(x)$
must be the result of $n-1$ successive applications of the operator to a
certain $\varphi_1(x)$ given by

which is seen to be identically zero for all values of $x$. Now the
application of the operator to zero yields zero. Therefore
$\varphi_n(x)$ must be identically zero for all values of $x$. Finally, it is
convenient to evaluate the typical operation

If (a-A)J-dA . 

This yields 

The expression for $f_n(x)$ now may be established in straightforward
fashion.

In the interval $0 \le x \le a$ the expression for $f_n(x)$ will conclude
with the term involving $[a - x]$. As has been shown before,
the expression for $f_{n+1}(x)$ should then be given by

for $0 \le x \le a$.

This operation may be evaluated readily, and will yield a result
that is correct. The form of the result, however, is such that it does
not display the desired correspondence with the expression for
$f_n(x)$. The expression for $\varphi_n(x)$ is introduced here to
advantage. Let it be written


for 





Since the last term is identically zero its introduction is permissible.
Remembering that $\varphi_n(x)$ consists of all the terms of $f_n(x)$
plus a final term in $[-x]$, a rearrangement may be made giving

Using the operations that have been evaluated above, there is
obtained immediately

fri+K+l^) [nQ+a-x]-('l+K4-|^)7l-K+|")[na-JiJ 


1 ) ) [ na-A;J - — 

+("Jj('l+K+^) Cl-K+l^/Cna-a-x] 




fl)" )Xa-^l},for 01;^ I a. 


Collecting terms, this becomes 

sn+1-2 




pj i' 


ap 


n+1^ 

1 




ap/ 

2Ks^/, . 




ap' 
for $0 \le x \le a$





and this is seen to be wholly consistent with the formula for $f_n(x)$
for the same interval with the substitution of $n+1$ for $n$.

Now for some interval between $a$ and $na$, say $(na - ba - a) \le x \le (na - ba)$
where $b$ is an integer having any value from zero to $n-2$, the
expression for $f_n(x)$ will conclude with the term involving
$[na - ba - x]$. For the interval immediately preceding, namely
$(na - ba - 2a) \le x \le (na - ba - a)$, the expression will include the
additional term involving $[na - ba - a - x]$. Therefore in evaluating
the operation which as has been shown before should yield the
expression for $f_{n+1}(x)$, the integration of all but the
$[na - ba - a - x]$ term will be carried over the complete range of
$\lambda$ from 0 to $a$. The term involving $[na - ba - a - x]$ will not
enter into the integrand until $\lambda$ reaches the value
$(x - na + ba + a)$, and so the integration of it will be between the
limits $(x - na + ba + a)$ and $a$. Using the operations that have been
evaluated previously there is obtained immediately



for $(na - ba - a) \le x \le (na - ba)$.





Collecting terms, this becomes 




a 


n+l 


/^sn+Hr 
■('py 


K+l^ ) (i- K+l^ j [(n+l ) a-a- ^] - 


n+l , 


for $(na - ba - a) \le x \le (na - ba)$,

and this is seen to be wholly consistent with the expression for
$f_n(x)$ for the same interval with the substitution of $n+1$ for $n$.
Finally, for the interval $na \le x \le (na + a)$ the expression for
$f_n(x)$ is identically zero. For the interval immediately preceding,
namely $(na - a) \le x \le na$, it consists of the single term involving
$[na - x]$. Therefore


d z' 4 ^ n+1'2 X 2 ^ . *^+l „ , . . 

““n+iiDj — ) [(n+l)a-A?] for na5Aii('n+l)a 

and this is seen to be wholly consistent with the expression for
$f_n(x)$ for the corresponding interval.

Setting $n = 1$ in the general expression for $f_n(x)$ for the
interval $0 \le x \le a$ there is obtained

Carrying out the indicated operations, this becomes

$$f_1(x) = \frac{1}{a}\left[1 + K - \frac{2K}{a}(a - x)\right], \quad \text{for } 0 \le x \le a,$$

and this is the $f_1(x)$ chosen at the start.

In the derivation of $f_{n+1}(x)$ from $f_n(x)$ it was assumed that
the two forms of the operator $P$ are commutative with the operator
$\frac{1}{p}$ when applied to an operand of the form $[B - x]$. It is very
easy to show that applying either form of $P$ after $\frac{1}{p}$ to $[B - x]$
yields a result identical with that of applying the two in the reverse
order, and the space will not be taken here to give this demonstration.
The formula for $f_n(x)$ may thus be regarded as firmly established.

It has been shown that the complete expression $\varphi_n(x)$ is
identically zero for all values of $x$. Therefore, if in the interval
$(na - ba - a) \le x \le (na - ba)$ the desired function $f_n(x)$ can be
represented by the partial expression



it follows that it may equally well be represented in the same interval
by the negative of the remainder of the complete expression, or





Kemembering that j, 

rewritten 




'I \m2 
,P 

'n 




ZJ\^ ^ ap, 







where again it is understood that each term including a bracket
member is to be assigned the value zero for values of $x$ which
render the bracket negative. The having of these two forms of
expression for $f_n(x)$ is very valuable in computation work, since it
limits the number of terms that have to be handled to
$\frac{n+1}{2}$ at most.

Setting $K$ equal to zero gives the special case of the rectangular
distribution, and the expressions for $f_n(x)$ reduce to the forms

■fn ] ’ 

and 

Carrying out the indicated operations, these become 

2 - *] - - - -] 

and 

This last expression is the one usually found in the literature, and 
it was originally developed by Laplace as the limiting form of an 
urn problem. 
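As a hedged illustration of the rectangular case, the sketch below evaluates the density of the sum of $n$ rectangular selections in the form usually quoted in the literature, $\frac{1}{a^n (n-1)!}\sum_b (-1)^b \binom{n}{b}[x - ba]^{n-1}$, with each bracket taken as zero when negative. The function name `rectangular_sum_density` and its arguments are illustrative assumptions, and the formula is the standard one rather than a transcription of the operator expressions of this paper.

```python
from math import comb, factorial


def rectangular_sum_density(x, n, a=1.0):
    """Density at x of the sum of n independent selections uniform on [0, a].

    Standard alternating-sum form; bracket members [x - b*a] are treated as
    zero whenever they are negative (hypothetical helper, not the paper's
    operator notation).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if x < 0 or x > n * a:
        return 0.0
    if n == 1:
        return 1.0 / a
    total = 0.0
    for b in range(n + 1):
        bracket = x - b * a
        if bracket < 0:  # all later brackets are negative as well
            break
        total += (-1) ** b * comb(n, b) * bracket ** (n - 1)
    return total / (a ** n * factorial(n - 1))
```

Because only the brackets that are non-negative contribute, at most about half of the terms need to be handled if the end-for-end symmetry $f_n(x) = f_n(na - x)$ is used to evaluate the density from whichever end is nearer.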

Setting $K$ equal to plus one or to minus one gives either of
the extreme cases of the "right triangular distribution." For $K$
equal to plus one the expressions for $f_n(x)$ reduce to



and 





The function $f_n(x)$ normally has no direct practical application,
but it is of interest to see its trend with increasing values of $n$.
There are shown on Figure 1 several members of the family of
curves originating with the rectangular distribution, and on Figure
2 corresponding members of the family originating with the right
triangular distribution. In both figures the interval $a$ has been
taken as unity, and there have actually been plotted the curves
$y = n f_n(nx)$. This change in variable places all curves to a
common base, and at the same time preserves the property of the
total area under each being unity.

For the sum of $n$ selections the practical worker wants to know
the minimum value $x'$; or the maximum value $x''$; or, most often
of all, the shortest interval $x'$ to $x''$ associated with a certain
probability value. The probability that the sum of $n$ selections
will be less than $x'$ is given by

and the probability that the sum will exceed $x''$ is given by


dx. 

XT • . A 

Noting that f tx^-ba'] dx^f baj and that 


.no na-ba . ^ 

/ [na-ba-x]dx;=:/ [nQ-ba-x]dx;«-~«[na- ba-x"], since the 


bracket members are assigned the value zero for values of x which 
render them negative, there may be written immediately 

and 



FIGURE 2. Graphs of curves $y = n f_n(nx)$ for right triangular distribution.



For $x' = na$ or for $x'' = 0$ these last expressions respectively should
equate to unity, the total area under any probability curve. It is
simple to verify that they do so, and the demonstration will be
given for one of them. For the interval $0 \le x'' \le a$ the expression
concludes with the term containing $[a - x'']$. Let there
be added and subtracted a term involving $[-x'']$ to give


n-1 
nd. 




fnWdn(~ 

From inspection it is seen that this may be written 




ap 



where <^| (x") is the expression introduced previously, ^pon car- 
rying out the operations it is found that - ~ ) ^ 

ft'P It follows therefore that 




a 


n~l 


i-K + ~ ) [”-^''1 .for O^x'^a 
ap / 


and it is apparent now that for $x'' = 0$ this expression is equal to
unity. Thus it is seen that the function $F_n$ also possesses an
end-for-end symmetry similar to that of $f_n(x)$. The complete
expression corresponds to $F_n$ equal to unity instead of
zero, however, and where the desired function is represented by a
partial expression it can also be equally well represented by one
minus the remaining terms of the complete expression.

FIGURE 3

Referring to Figure 3 it is seen that the sum of area A
plus area B, or $F'_n(x') + F''_n(x'')$, gives the probability that
the sum of $n$ selections lies outside the interval $x'$ to $x''$. In
an actual problem usually either the sum of A and B would be
assigned and it would be desired to find the interval $x'$ to $x''$
associated with this expectancy, or the interval $x'$ to $x''$ would
be assigned and it would be desired to find the expectancy
associated with these limits. With $f_n(x)$ of the character shown in
Figure 3 and with the sum alone of A and B fixed, any number of
pairs of values are possible for $x'$ and $x''$. It is also clear that
the length of the interval $x'' - x'$ will depend upon the relative
magnitudes of A and B. There are two special cases, however, which
cover all normal demands. It is seen at a glance that the interval
$x'' - x'$ will be shortest for a given A plus B when $f_n(x'') = f_n(x')$.
Purely from the standpoint of deviations, this shortest interval
represents the optimum results of which the group is capable.
Where the absolute magnitude is of primary concern it might be
specified that

The function which is of final interest, then, is represented by
the sum $F'_n(x') + F''_n(x'')$, subject either to the restriction that
$f_n(x') = f_n(x'')$ or to the restriction that $x' = na - x''$. For the
special case of the rectangular distribution $f_n(x)$ is symmetrical
about the line $x = \frac{na}{2}$ and the two restrictions imply the same
thing.

The symmetry of the rectangular distribution permits giving formal
expression to the sum $F'_n(x') + F''_n(x'')$ as a function of $(x'' - x')$
when $x' = na - x''$. Under this condition the sum becomes equal
to $2F'_n(x')$ and $x'$ may be written as $\frac{na - (x'' - x')}{2}$, and
there is obtained

For linear distributions other than the rectangular this simplicity
of expression is not possible.

On Figures 4 and 5 are shown graphs of the curves $y = F'_n(nx)$
for several values of $n$ for the rectangular and for the right
triangular distribution respectively. Here again the interval $a$ has
been taken as unity and change in variable has been made to place
all curves to a common base. The values of $F''_n(nx)$ may be read
directly from the same curves, since $F''_n(x'') = 1 - F'_n(x')$ for
corresponding values of $x'$ and $x''$. Finally, on Figures 6 and 7
are shown curves for the sum $[F'_n(x') + F''_n(x'')]$ plotted as a
function of $\frac{x'' - x'}{n}$, subject to the restriction that
$f_n(x') = f_n(x'')$, for several values of $n$ for the rectangular and for
the right triangular distribution respectively. The values for Figure 6
were computed directly from the formula given in the paragraph above.
For Figure 7, however, the values were derived graphically from
Figures 2 and 5.

Figures 6 and 7 are applicable immediately to practical problems.
As a simple example, suppose there is at hand a group whose
individuals are known to lie within the limits of $D$ and $D+a$ and
to follow a right triangular distribution with the larger probability
associated with the larger limit, and it is desired to know for the
sum of eight selections what limits may be expected to be associated
with a probability value of 0.01. Referring to Figure 7 it is seen
that the curve for $n = 8$ reaches an ordinate value of 0.01 at an
abscissa value of approximately 0.45. Referring now to Figure 2,
the distance 0.45 is fitted in between the two legs of the curve
for $n = 8$, and values for $\frac{x'}{n}$ and $\frac{x''}{n}$ of 0.43 and 0.88
are found. Consequently it can be concluded that for sums of eight
selections from this group the probability is 0.01 that the values will
lie outside the interval $8D + 3.44a$ to $8D + 7.04a$.
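The graphical procedure of this example can be cross-checked numerically. The sketch below is an illustration under stated assumptions, not the method of the paper: it takes $D = 0$ and $a = 1$, replaces each selection (density $2x$ on the unit interval) by the midpoint of one of `steps` equal bins, convolves eight such selections, and then searches for the shortest run of bins holding a probability of 0.99. The names `right_triangular_sum_pmf` and `shortest_interval` are hypothetical.

```python
def right_triangular_sum_pmf(n, steps=100):
    """Discretised distribution of the sum of n selections with density 2x on [0, 1].

    Bin k of the returned list corresponds to a sum near (k + n/2) / steps.
    """
    h = 1.0 / steps
    # exact probability of each bin for a single selection
    single = [((i + 1) * h) ** 2 - (i * h) ** 2 for i in range(steps)]
    pmf = single[:]
    for _ in range(n - 1):
        out = [0.0] * (len(pmf) + steps - 1)
        for i, p in enumerate(pmf):
            for j, q in enumerate(single):
                out[i + j] += p * q
        pmf = out
    return pmf, h


def shortest_interval(n, outside=0.01, steps=100):
    """Shortest interval of the sum (units of a, measured from n*D) holding 1 - outside."""
    pmf, h = right_triangular_sum_pmf(n, steps)
    target = 1.0 - outside
    best = None
    lo = 0
    mass = 0.0
    for hi, p in enumerate(pmf):
        mass += p
        # shrink from the left while the window still holds the target mass
        while mass - pmf[lo] >= target:
            mass -= pmf[lo]
            lo += 1
        if mass >= target and (best is None or hi - lo < best[1] - best[0]):
            best = (lo, hi)
    lo, hi = best
    return (lo + n / 2 - 0.5) * h, (hi + n / 2 + 0.5) * h
```

Calling `shortest_interval(8)` returns a pair of limits that may be compared with the values read from Figures 2 and 7; some difference is to be expected, since the figures were read graphically and the sketch works on a finite grid.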



FIGURE 4. Graphs of curves $y = F'_n(nx)$ for rectangular distribution.

FIGURE 5. Graphs of curves $y = F'_n(nx)$ for right triangular distribution.

FIGURE 6. Graphs of curves of $[F'_n(x') + F''_n(x'')]$ for rectangular distribution, for condition that $f_n(x') = f_n(x'')$.

FIGURE 7. Graphs of curves of $[F'_n(x') + F''_n(x'')]$ for right triangular distribution, for condition that $f_n(x') = f_n(x'')$.



While this study has been concerned primarily with the linear
distribution, it is obvious that the results may find occasional wider
application. The curves for $f_n(x)$ are of a character
suggestive of distributions that might occur not infrequently
in engineering and physics. If any one of them, say $y = f_k(x)$, is
found to fit the group at hand with practical accuracy, then the
sequence $f_k(x), f_{2k}(x), f_{3k}(x)$, etc., clearly will give the
distribution curves associated with the sums of one, two, three, etc.,
selections from this group.



ON THE ELIMINATION OF SYSTEMATIC ERRORS 
DUE TO GROUPING 


By 

John R. Abernethy 

In the calculation of the moments of a frequency distribution it
is often desirable, or even necessary, to consider not the distribution 
itself but another derived from it by certain groupings. As a first 
approximation to the moments of this original distribution we take 
the corresponding moment of the grouped distribution. But this first 
approximation is not satisfactory, and it is necessary to obtain some 
method for the elimination of part of the error committed in replac- 
ing the moments of the original distribution by the corresponding 
moments of the grouped distribution. 

This problem was first discussed by W. F. Sheppard in a paper:
On the Calculation of the most Probable Value of Frequency-Constants,
for Data arranged according to Equidistant Divisions of a
Scale.¹ If we denote the n-th moment of the original distribution
by $\mu_n$ and the n-th moment of the grouped distribution by $\nu_n$,
we will have Sheppard's corrections in the form:

^2.° ^ “iZ ' 

As pointed out by Karl Pearson² the hypotheses under which
these formulae have been obtained are: (a) that Taylor's theorem

¹ Proceedings London Mathematical Society, Vol. 29, p. 353-380.

² On an elementary proof of Sheppard's formulae for correcting raw
moments and other allied points, editorial in Biometrika, Vol. 3, p. 308-312.





is applied to the frequency function throughout the range; (b) $f(x)$
is finite and continuous throughout the range; (c) $f(x)$
and its derivatives vanish at the limits of the range. These
hypotheses are not always satisfied by the frequency functions with which
the statistician has to work; and as it is impossible to tell before
calculating the moments of a distribution whether the corresponding
theoretical frequency function satisfies these conditions, it is
desirable to study the problem from another standpoint.

A comparison of the title of Sheppard's paper and the paper
itself suggests the question, in what sense do Sheppard's formulae
give the most probable value of the moments of a distribution? A
partial answer is given by B. L. Shook in the Synopsis of Elementary
Mathematical Statistics.³ Miss Shook presents⁴ the formulae

$$\mu_1 = \nu_1 = 0, \qquad \mu_2 = \nu_2 - \frac{1}{12}\left(1 - \frac{1}{m^2}\right), \qquad \mu_3 = \nu_3$$

for a discrete distribution with $m$ values of the variable grouped in
each class interval and shows that for a particular distribution these
formulae serve to eliminate the systematic errors from the moments.⁵
Two problems are suggested by the synopsis: the derivation of formulae
for the class of discrete distributions, as these three formulae are
stated without proof; the proof that this larger set of formulae and
those of Sheppard do serve under all conditions to eliminate the
systematic error due to grouping, subject only to the existence of
the moments involved. When we have solved these two problems,
we shall be in position to understand the true nature of the
approximation involved in employing Sheppard's corrections and
corrections similar to them for discrete distributions.

³ The Annals of Mathematical Statistics, Vol. 1, p. 34-40.

⁴ These formulae are only special cases of a more general formula stated
by H. C. Carver in an editorial, Annals of Mathematical Statistics,
Vol. 1; formula (14), p. 111.

⁵ Two methods of developing this formula suggest themselves: (a) the
elimination of the moment of a continuous graduating function expressed in
terms of fine groupings of class intervals of $\frac{1}{m}$ on the one hand and in
the terms of coarse groupings of unit class intervals on the other, (b) by a
process similar to that of Sheppard, employing for example Lubbock's
formula instead of the Euler-Maclaurin sum formula. According to a statement
made by Professor Carver, the formulae in question were derived by the
latter process.

The problem we wish to consider is this: Given the probabilities
that a value of the statistical variable $x$ taken at random
will fall within the interval $x_i - \tfrac{1}{2} < x < x_i + \tfrac{1}{2}$, we wish to find
the moments of the distribution. We consider this problem for two
classes of distributions: the distribution of a discrete variable; the
distribution of a continuous variable. In either case we shall work
with the uni-frequency distributional function $f(x)$. For the
discrete distribution $f\left(\frac{x}{m}\right)$ represents the probability that a value
taken at random will be the number $\frac{x}{m}$; $m$ denotes a definite
positive integer, $(x = 0, \pm 1, \pm 2, \ldots)$. For the continuous
distribution $\int_a^b f(x)\,dx$ represents the probability that a value of $x$
taken at random will fall within the interval $a < x < b$. Thus the
function $f(x)$ has the value zero outside the range of the distribution
and we may for the sake of convenience denote the limits of
summation and integration as $\pm\infty$. For the n-th moment about
the origin we have

$$\mu_n = \sum_{x=-\infty}^{+\infty} \left(\frac{x}{m}\right)^{n} f\left(\frac{x}{m}\right)$$

for the discrete distribution, and

$$\mu_n = \int_{-\infty}^{+\infty} x^{n} f(x)\,dx$$

for the continuous distribution. What we want is the value of $\mu_n$.
What we are able to find is the value of

v'=.Z] 





In establishing approximate relations between the set of true
moments $[\mu_n]$ and the set of raw moments $[\nu_n]$ we shall
employ another set of statistical constants $[\bar{\nu}_n]$. For the discrete
distribution there are $m$ distinct sets of groupings that can be made,
leading to $m$ values of the raw moment $\nu_n$; $\bar{\nu}_n$ is used to
represent the average of these. Similarly for the continuous distribution,
$\bar{\nu}_n$ is used to denote the average of the moments $\nu_n(t)$
corresponding to $x_i = i + t$ for all values of $t$ satisfying $0 \le t < 1$.
We shall call this intermediate set of statistical constants the average
grouped moments of the distribution. We then divide the problem
into two parts. First we seek the expression of $\mu_n$ in terms of the
$\bar{\nu}_n$. Secondly we seek the nature of approximation in replacing
$\bar{\nu}_n$ by $\nu_n$. The first of these can be solved completely without
approximation and without any assumption other than the existence
of the moments involved. We can best understand the nature of
the approximation involved in the second after the first of our two
problems has been solved.

The $m$ values of $\nu_n(t)$ corresponding to the $m$ distinct methods
of grouping a discrete distribution are given by


1 min m-l , 1 

V,Cf)=Z] a+t+^) 

[S-OO J*0 






We shall first express the average grouped moment $\bar{\nu}_n$ in terms of
the true moments $\mu_n$, then solve for the $\mu_n$ in terms
of the $\bar{\nu}_n$. We wish to arrange the right hand side of the above
equation according to values of the argument $x$ appearing in $f(x)$;
we therefore let $s = mi + t + j$. This equation then becomes



zm 


from which, by means of the binomial theorem, we obtain 




We therefore have

$$\bar{\nu}_n = \sum_{j=0}^{n} \binom{n}{j}\, b_j(m)\, \mu_{n-j}, \tag{1}$$

where

$$b_j(m) = \frac{1}{m} \sum_{i=0}^{m-1} \left(\frac{m - 1 - 2i}{2m}\right)^{j}. \tag{2}$$

We shall sometimes write $b_j$ instead of $b_j(m)$ in order to simplify
the expression of an equation. The change of order of summation
is based on the assumption that the $m$ summations $\nu_n(t)$ converge
absolutely, an assumption equivalent to that of the existence of
$\mu_n$, since $f(x)$ has only positive or zero values.

We see immediately from (2) that $b_{2i+1}(m) = 0$, since the
terms of the summation cancel each other in pairs, with the possible
addition of a middle term equal to zero. The calculation of $b_{2i}(m)$
may readily be effected by means of the Euler-Maclaurin sum
formula⁶


m-i rn oo 

^-g0tj;)=y g(tjd+ + S 

J=0 l, = i 


PiL— 

_4^ (Zi)\ 


(Zi-i) 

g 



where

$$D_0 = 1, \quad D_2 = -\frac{1}{3}, \quad D_4 = \frac{7}{15}, \quad D_6 = -\frac{31}{21}, \quad D_8 = \frac{127}{15}.$$



We substitute 


⁶ See for example Nörlund's Differenzenrechnung, Berlin 1924,
especially formulae (39), p. 27; (42), p. 28, and (49), p. 30. Formula (39) is

for 


From this we may obtain the values of $D_\nu$ and show that $D_{2\nu+1} = 0$; also
we obtain $\sum_{i=0}^{n} \binom{2n+1}{2i} D_{2i} = 0$ for $n > 0$, which we shall
employ in the proof of our formula (7).





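obtaining

$$b_{2k}(m) = \frac{1}{4^{k}(2k+1)} \sum_{i=0}^{k} \binom{2k+1}{2i} \frac{D_{2i}}{m^{2i}}. \tag{3}$$

The first few values are:

$$\begin{aligned}
b_0(m) &= 1,\\
b_2(m) &= \frac{1}{12}\left(1 - \frac{1}{m^2}\right),\\
b_4(m) &= \frac{1}{240}\left(3 - \frac{10}{m^2} + \frac{7}{m^4}\right),\\
b_6(m) &= \frac{1}{1344}\left(3 - \frac{21}{m^2} + \frac{49}{m^4} - \frac{31}{m^6}\right),\\
b_8(m) &= \frac{1}{11520}\left(5 - \frac{60}{m^2} + \frac{294}{m^4} - \frac{620}{m^6} + \frac{381}{m^8}\right).
\end{aligned}$$
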


A control on the values of $b_j(m)$ may be obtained by substituting
$m = 1$; then all except $b_0$ vanish, as $b_i(1) = 0$ for $i > 0$.
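The control just described can be extended to other values of $m$ with exact rational arithmetic. The sketch below assumes that $b_j(m)$ is the mean $j$-th power of the $m$ offsets $(m - 1 - 2i)/(2m)$ between a value and the centre of its class, which is the reading of (2) adopted here, and compares the direct sum with the closed forms listed for $b_0$ through $b_8$. The helper names `b` and `b_closed` are illustrative.

```python
from fractions import Fraction


def b(j, m):
    """b_j(m) as a direct sum over the m offsets (m - 1 - 2i) / (2m) (assumed form of (2))."""
    return sum(Fraction(m - 1 - 2 * i, 2 * m) ** j for i in range(m)) / m


def b_closed(j, m):
    """Closed forms for the first few even-order b_j(m), with u = 1/m^2."""
    u = Fraction(1, m * m)
    forms = {
        0: Fraction(1),
        2: (1 - u) / 12,
        4: (3 - 10 * u + 7 * u ** 2) / 240,
        6: (3 - 21 * u + 49 * u ** 2 - 31 * u ** 3) / 1344,
        8: (5 - 60 * u + 294 * u ** 2 - 620 * u ** 3 + 381 * u ** 4) / 11520,
    }
    return forms[j]
```

With these helpers, `b(j, 1)` is zero for every positive `j`, and the odd-order values vanish for every `m`, in agreement with the remarks in the text.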

Having in (1) and (3) obtained the expression of the average
grouped moment of a discrete distribution in terms of the true
moments, we wish to solve for the true moments in terms of the average
grouped moments. We shall obtain this solution by the method of
undetermined coefficients. Let

$$\mu_n = \sum_{j=0}^{n} \binom{n}{j} A_{n-j}\, \bar{\nu}_j. \tag{4}$$

Substituting this in (1) we shall have





from which we obtain 
,1 ^ 




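by a change of order of summation effected by applying the Dirichlet
sum formula⁷

$$\sum_{i=0}^{n} \sum_{j=0}^{n-i} \varphi(i, j) = \sum_{j=0}^{n} \sum_{i=0}^{n-j} \varphi(i, j).$$

Equating coefficients of $\bar{\nu}_j$ gives us the recurring formula

$$\sum_{i=0}^{k} \binom{k}{i} A_{k-i}\, b_i = 0, \qquad k > 0,$$

together with the initial condition $A_0 = 1$. This may also be written

$$A_k = -\sum_{i=1}^{k} \binom{k}{i} A_{k-i}\, b_i, \qquad A_0 = 1. \tag{5}$$
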

Ordinarily in an expression such as (4) we would have written
$A_{n-j}(n)$ instead of $A_{n-j}$; had we done so in this case, we would
now drop the functional expression as we have shown that the value
of $A_{n-j}(n)$, depending only on $n-j$, is completely independent
of $n$. The coefficients $A$ are also independent of the position
of the origin since if in




⁷ For the method of derivation of this formula see, for example,
Steffensen's Interpolation (Baltimore, 1927), p. 91-92.





we substitute 




I 


n-i’X 


n-L 

j=0 


nn 

J 




J’ 


J 


we shall have 


I 

^n.)C+ h 



and hence 


Mtt:;6+h ^ J ^n-i'x;+h 


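If in (5) we substitute $k = 1$ we shall obtain $A_1 = -b_1 = 0$.
Moreover in general $A_{2i+1} = 0$, since by induction if
$A_1 = A_3 = \cdots = A_{2i-1} = 0$, each term of the summation (5) for
$k = 2i+1$ contains a zero factor, either an $A$ or a $b$ of odd index.

Also from (5) we obtain:

$$\begin{aligned}
A_2 &= -b_2,\\
A_4 &= -b_4 + 6\,(b_2)^2,\\
A_6 &= -b_6 + 30\, b_2 b_4 - 90\,(b_2)^3,\\
A_8 &= -b_8 + 56\, b_2 b_6 + 70\,(b_4)^2 - 1260\,(b_2)^2 b_4 + 2520\,(b_2)^4.
\end{aligned}$$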
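The recurrence (5) lends itself to a direct check. The sketch below is an illustration with hypothetical helper names: it assumes the same form of $b_j(m)$ as in (2), generates the $A_k$ from $A_0 = 1$ with exact fractions, and leaves the comparison with the expressions in terms of the $b$'s to the caller.

```python
from fractions import Fraction
from math import comb


def b(j, m):
    """b_j(m): mean j-th power of the offsets (m - 1 - 2i) / (2m) (assumed form of (2))."""
    return sum(Fraction(m - 1 - 2 * i, 2 * m) ** j for i in range(m)) / m


def A_coefficients(m, k_max):
    """A_0, ..., A_{k_max} from the recurrence sum_i C(k, i) A_{k-i} b_i = 0, A_0 = 1."""
    A = [Fraction(1)]
    for k in range(1, k_max + 1):
        A.append(-sum(comb(k, i) * A[k - i] * b(i, m) for i in range(1, k + 1)))
    return A
```

For example, `A_coefficients(3, 8)` returns the nine coefficients for three values per class interval; the odd-order entries come out exactly zero, as the induction argument requires.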





An observation of these expressions of the $A$'s suggests the formula:

$$A_{2j} = (2j)! \sum (-1)^{u}\, \frac{u!}{a_1!\, a_2! \cdots a_j!} \left(\frac{b_2}{2!}\right)^{a_1} \left(\frac{b_4}{4!}\right)^{a_2} \cdots \left(\frac{b_{2j}}{(2j)!}\right)^{a_j},$$

where $u = a_1 + a_2 + \cdots + a_j$, the summation extending over all
positive integral or zero values of the $a$'s satisfying
$a_1 + 2a_2 + \cdots + j a_j = j$. That this formula holds in general may be
proved by induction: assume it true for $i = 0, 1, \ldots, j-1$ and
substitute in (5) for $k = 2j$. Upon collecting terms according to
products of the $b$'s we shall have established this formula also for
$i = j$, and hence for every positive integral value.

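If in the expressions of the $A$'s in terms of the $b$'s we
substitute the values of the $b$'s in terms of $m$, we shall obtain the
expression of the $A$'s in terms of $m$. Thus we have

$$\begin{aligned}
A_2 &= -\frac{1}{12}\left(1 - \frac{1}{m^2}\right),\\
A_4 &= \frac{1}{240}\left(7 - \frac{10}{m^2} + \frac{3}{m^4}\right),\\
A_6 &= -\frac{1}{1344}\left(31 - \frac{49}{m^2} + \frac{21}{m^4} - \frac{3}{m^6}\right),\\
A_8 &= \frac{1}{11520}\left(381 - \frac{620}{m^2} + \frac{294}{m^4} - \frac{60}{m^6} + \frac{5}{m^8}\right).
\end{aligned} \tag{6}$$
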

A comparison of the values of $A_0$ with $b_0$, of $A_2$ with $b_2$, of
$A_4$ with $b_4$, of $A_6$ with $b_6$, and of $A_8$ with $b_8$ shows a remarkable
similarity between the coefficients in $A_{2i}$ and those in $b_{2i}$; in fact
we observe that

$$A_{2i}(m) = \frac{1}{m^{2i}}\, b_{2i}\!\left(\frac{1}{m}\right).$$

Substituting $\frac{1}{m}$ for $m$ in (3) and dividing by $m^{2k}$ we obtain

$$A_{2k}(m) = \frac{1}{4^{k}(2k+1)} \sum_{i=0}^{k} \binom{2k+1}{2i+1} \frac{D_{2k-2i}}{m^{2i}}. \tag{7}$$


In order to prove that (7) is true in general, we assume it true up
to a certain point and prove it true for the next highest value of $k$.
That is, we assume that (7) holds for $A_{2i}(m)$, $i = 0, 1, \ldots, k-1$,


and substitute in 


/A 


ZK 



^Zk-Zi '^Zi 


another form of (5) since" ~ 0) we have 


and 


i . 4 D 

h r y f zk-zi-i-lx -^z.1 

ZH-Zl (Zk- Zi-i-l) j.Q ^ ' m^J ’ 


d 


^Zl 


4^21+0 


2 (T) 


21+1 1 ^Zr 


r=o 


m 


Zi-Zr 


After this substitution we arrange the terms of $A_{2k}$ according to
powers of $\frac{1}{m^2}$, obtaining



2k+-l 




1 

(Zk-Zs-^l) 



k-Zs-t'l 
Zr 


> 


zr 


) 


where 6 - i - p - j , When s=: 0. 1, k-1 the summation 

extends from 0 to r = k- s for j 0 , but from r = O to 





r-k-s-lfor j = 0. Since 
k-s 




for a < k , the summation as to r gives zero for j ^ O but 

-(Z\<-Zs + l)D^^_^^, forj=0. 

At the same time for j =0 , the factor equals unity 

and we have the desired terms for a= 0> k- 1 • 

For k we have constantly r =i 0 , the summation as to j be- 
ing from j = 1 to j » k ; we therefore have 




at the same time that 


- ^k-2s+lv 

5.( .n K'-i- 


(Zk-Zs^l) Zr’ 

We therefore come again to formula (7) with $i$ replaced by $s$.
Hence formula (7) is true for every positive integral value of $k$.
The first few values of the $A$'s have been calculated in (6); others
may be easily obtained by substituting the value of the Eulerian
numbers from some table of $D_{2i}$.⁸

Formulae (4) and (7) give us the expression of the true mo- 
ment 




of a discrete distribution in terms of the set of average grouped
moments


⁸ Nörlund, loc. cit., Tafel 4, p. 458, gives the value up to





t t » 

^ j«-oo ''T^ 2m/^(m/ 2m/ 


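Employing the particular values given in (6), we have the formula

$$\begin{aligned}
\mu_n = \bar{\nu}_n &- \binom{n}{2} \frac{1}{12}\left(1 - \frac{1}{m^2}\right) \bar{\nu}_{n-2}
+ \binom{n}{4} \frac{1}{240}\left(7 - \frac{10}{m^2} + \frac{3}{m^4}\right) \bar{\nu}_{n-4}\\
&- \binom{n}{6} \frac{1}{1344}\left(31 - \frac{49}{m^2} + \frac{21}{m^4} - \frac{3}{m^6}\right) \bar{\nu}_{n-6}\\
&+ \binom{n}{8} \frac{1}{11520}\left(381 - \frac{620}{m^2} + \frac{294}{m^4} - \frac{60}{m^6} + \frac{5}{m^8}\right) \bar{\nu}_{n-8} - \cdots
\end{aligned} \tag{8}$$

For any particular value of $n$, this series terminates and we may
therefore apply the ordinary theory of limits to (8). Thus we
obtain

$$\mu_n = \bar{\nu}_n - \frac{1}{12}\binom{n}{2} \bar{\nu}_{n-2} + \frac{7}{240}\binom{n}{4} \bar{\nu}_{n-4}
- \frac{31}{1344}\binom{n}{6} \bar{\nu}_{n-6} + \frac{127}{3840}\binom{n}{8} \bar{\nu}_{n-8} - \cdots \tag{9}$$

the expression of the true moment $\mu_n = \int_{-\infty}^{+\infty} x^{n} f(x)\,dx$ of a
continuous distribution in terms of the set of average grouped
moments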

M 


/ ^ (>^+ 

- 00 


j )dz =■! x''(p (x.) d4 . 


We have thus completely solved the first of our two problems; we
have obtained the expression of the true moments in terms of the
average grouped moments without any assumption other than the
existence of $\mu_n$. The existence of $\mu_n$ requires the convergence
of the summation or integration as the lower limit approaches $-\infty$
and as the upper limit approaches $+\infty$ independently.

If in (8) we replace


'Vf 


+c>a 

-L ■ 

j=-<» 


m (- + 

m V m 


m 


^ W ^m / 


by 


j = -00 ^ ^ 


we will have the general Sheppard-Carver formula. Since there is
no approximation involved in (8), any error in the Sheppard-Carver
formulae must be a result of the error involved in replacing
the average grouped moments $\bar{\nu}_n$ by the raw moments $\nu_n$. By
definition $\bar{\nu}_n$ is the average of the


- 1 L 

“Vj f 1 ) « C ^ t 

j=-«> 


and, therefore, if we take any particular grouping at random,

V(' = "£ 


is the mean of a random sample of one from the parent distribution
(i) and hence the most probable value of $\bar{\nu}_n$. The
Sheppard-Carver formula, therefore, gives the most probable value of
the true moment $\mu_n$ of a discrete distribution in the sense that
these formulae eliminate the systematic errors due to grouping.
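The statement that the corrections are exact for the average grouped moments can be verified on any small discrete distribution. The sketch below is a hedged illustration: the grouping rule (classes of $m$ consecutive values starting at offset $t$, each represented by its centre), the dictionary representation of the distribution, and all function names are assumptions made for the example. It averages the raw moments over the $m$ possible groupings, applies the coefficients $A_j$ generated from the recurrence, and returns the corrected moment as an exact fraction.

```python
from fractions import Fraction
from math import comb


def b(j, m):
    """b_j(m): mean j-th power of the offsets (m - 1 - 2i) / (2m) (assumed form)."""
    return sum(Fraction(m - 1 - 2 * i, 2 * m) ** j for i in range(m)) / m


def A_coefficients(m, k_max):
    """A_0..A_{k_max} from sum_i C(k, i) A_{k-i} b_i = 0 with A_0 = 1."""
    A = [Fraction(1)]
    for k in range(1, k_max + 1):
        A.append(-sum(comb(k, i) * A[k - i] * b(i, m) for i in range(1, k + 1)))
    return A


def true_moment(pmf, m, n):
    """n-th moment about the origin; pmf maps an integer s to P(value = s/m)."""
    return sum(p * Fraction(s, m) ** n for s, p in pmf.items())


def grouped_moment(pmf, m, n, t):
    """Raw n-th moment when each class is {m*i + t, ..., m*i + t + m - 1} / m."""
    total = Fraction(0)
    for s, p in pmf.items():
        i = (s - t) // m                                 # class containing s
        mark = Fraction(2 * (m * i + t) + m - 1, 2 * m)  # centre of that class
        total += p * mark ** n
    return total


def corrected_moment(pmf, m, n):
    """Apply the A_j to the moments averaged over the m possible groupings."""
    A = A_coefficients(m, n)
    avg = [sum(grouped_moment(pmf, m, r, t) for t in range(m)) / m
           for r in range(n + 1)]
    return sum(comb(n, j) * A[j] * avg[n - j] for j in range(n + 1))
```

A single grouping, chosen at random, will in general give a corrected value that differs from the true moment; it is only the average over the groupings that is reproduced exactly, which is the sense of "correct on the average" used in the text.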

Similarly, we shall obtain Sheppard's corrections if in (9) we
replace

+« 

=y* a; 0(!c)d)i , 

‘oo 

by 

I L 

v[ = J: (zp (pC^C:). 

j=-w ■* 

These formulae give the most probable value of the true moments
$\mu_n$ for a continuous distribution in the same sense as do the
Sheppard-Carver formulae for a discrete distribution.

The Sheppard corrections for continuous distributions and the
Sheppard-Carver corrections for discrete distributions give the most
probable value of the true moments $\mu_n$ of a distribution $f(x)$
in the sense that they give an approximate value for $\mu_n$ which is
correct on the average. That is, these formulae eliminate the
systematic errors due to grouping whatever the distributional function
$f(x)$ so long as the moments under consideration exist. While it is
true that the accidental errors not accounted for in these corrections
may not be negligible, these formulae do give the most probable value
of $\mu_n$ for a particular grouping and hence have a basis for
universal application.



ON MULTIPLE AND PARTIAL CORRELATION 
COEFFICIENTS OF A CERTAIN SEQUENCE 
OF SUMS 

By 

Carl H. Fischer

In a recent paper* the writer considered a sequence of $q$
variables defined as follows: The first variable, $x_1$, is defined as the
sum of $n_1$ values of a variable, $t$, drawn at random from a
population characterized by a rather arbitrary continuous probability
function, $f(t)$. Each succeeding variable, $x_i$ $(i > 1)$, is defined
as the sum of $k_i$ values of $t$ drawn at random from the $n_{i-1}$
values composing $x_{i-1}$, plus the sum of $n_i - k_i$ values of $t$ drawn
at random from the parent population.

For variables thus defined, it was proved that the correlation
coefficient between any two consecutive sums, $x_{i-1}$ and $x_i$, is
independent of the probability function, $f(t)$, and is given by


(I) 


p 






Vi 


It was further shown that the correlation coefficient between
two variables not consecutive in the sequence is equal to the
product of the respective coefficients of correlation between all
intermediate pairs of consecutive variables. Thus, the coefficient of
correlation between $x_j$ and $x_p$, $(j < p)$, is

$$r_{x_j x_p} = r_{x_j x_{j+1}}\; r_{x_{j+1} x_{j+2}} \cdots r_{x_{p-1} x_p},$$

or, in a simpler notation,

$$r_{jp} = r_{j,j+1}\; r_{j+1,j+2} \cdots r_{p-1,p}. \tag{2}$$



* On Correlation Surfaces of Sums with a Certain Number of Random
Elements in Common, Annals of Mathematical Statistics, Vol. IV, pp. 103-126.
May, 1933.





Let us now determine the multiple and partial correlation co- 
efficients existing among a sequence of variables thus defined. 

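Consider the fundamental symmetric determinant, $R$, which,
with its various co-factors, appears in the standard formulas for
multiple and partial correlation coefficients.¹ If we substitute for
each $r_{ij}$, $(j \ne i-1,\ i,\ i+1)$, in $R$, its equivalent from equation
(2), we have

$$R = \begin{vmatrix}
1 & r_{12} & r_{12}r_{23} & \cdots & r_{12}r_{23}\cdots r_{q-1,q} \\
r_{12} & 1 & r_{23} & \cdots & r_{23}r_{34}\cdots r_{q-1,q} \\
r_{12}r_{23} & r_{23} & 1 & \cdots & r_{34}r_{45}\cdots r_{q-1,q} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
r_{12}r_{23}\cdots r_{q-1,q} & r_{23}\cdots r_{q-1,q} & r_{34}\cdots r_{q-1,q} & \cdots & 1
\end{vmatrix}. \tag{3}$$
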

Multiply the second row of $R$ by $r_{12}$ and subtract it from the first
row. Now multiply the third row by $r_{23}$ and subtract it from the
second row. Continue this process, multiplying the j-th row by
$r_{j-1,j}$ and subtracting this row from the $(j-1)$st row, until all
possible rows have been so treated. Equation (3) may now be
written

¹ H. L. Rietz, "Mathematical Statistics", Carus Monograph No. 4, pp.
94-100.






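$$R = \begin{vmatrix}
(1-r_{12}^2) & 0 & 0 & \cdots & 0 & 0 \\
r_{12}(1-r_{23}^2) & (1-r_{23}^2) & 0 & \cdots & 0 & 0 \\
r_{13}(1-r_{34}^2) & r_{23}(1-r_{34}^2) & (1-r_{34}^2) & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
r_{1,q-1}(1-r_{q-1,q}^2) & r_{2,q-1}(1-r_{q-1,q}^2) & r_{3,q-1}(1-r_{q-1,q}^2) & \cdots & (1-r_{q-1,q}^2) & 0 \\
r_{1q} & r_{2q} & r_{3q} & \cdots & r_{q-1,q} & 1
\end{vmatrix}. \tag{4}$$

The expansion of $R$ may now be readily accomplished. The
application of this same method of procedure to each of the various
$R_{ij}$, (where $R_{ij}$ is the co-factor of the element $r_{ij}$), yields
without difficulty the results made use of in the remainder of this
paper.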
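Since the row operations leave a triangular array, the expansion of $R$ is simply the product of the diagonal members, $(1-r_{12}^2)(1-r_{23}^2)\cdots(1-r_{q-1,q}^2)$. The sketch below offers a numerical check under stated assumptions: the correlation matrix is built from a list of consecutive coefficients by the product rule (2), and its determinant is evaluated exactly by elimination. The function names and the list convention `r[j]` for $r_{j+1,j+2}$ are illustrative, and the entries of `r` are expected to be `Fraction` values.

```python
from fractions import Fraction


def correlation_matrix(r):
    """Matrix with r_ij equal to the product of the consecutive coefficients between i and j.

    r[k] is the correlation between variables k+1 and k+2 (Fraction values).
    """
    q = len(r) + 1
    R = [[Fraction(1)] * q for _ in range(q)]
    for i in range(q):
        for j in range(i + 1, q):
            prod = Fraction(1)
            for k in range(i, j):
                prod *= r[k]
            R[i][j] = R[j][i] = prod
    return R


def determinant(M):
    """Exact determinant by Gaussian elimination on a copy of M."""
    A = [row[:] for row in M]
    n = len(A)
    det = Fraction(1)
    for c in range(n):
        pivot = next((i for i in range(c, n) if A[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det
        det *= A[c][c]
        for i in range(c + 1, n):
            f = A[i][c] / A[c][c]
            for k in range(c, n):
                A[i][k] -= f * A[c][k]
    return det
```

For any list of consecutive coefficients the determinant returned can be compared with the product of the factors $1 - r_{j,j+1}^2$.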

A. Multiple Correlation Coefficients.

The formula for the multiple correlation coefficient of one
variable on the remaining $q-1$ variables is

$$r_{i \cdot 12 \cdots (i-1)(i+1) \cdots q} = \sqrt{1 - \frac{R}{R_{ii}}}. \tag{5}$$

From equations (3) and (4) we derive the following expressions
for the necessary $R$ and $R_{jj}$, $(j = 1, 2, 3, \ldots, q)$.




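$$
\begin{aligned}
R &= (1-r_{12}^2)(1-r_{23}^2)(1-r_{34}^2)\cdots(1-r_{q-2,q-1}^2)(1-r_{q-1,q}^2); \\
R_{11} &= (1-r_{23}^2)(1-r_{34}^2)\cdots(1-r_{q-1,q}^2); \\
R_{jj} &= (1-r_{12}^2)(1-r_{23}^2)\cdots(1-r_{j-2,j-1}^2)(1-r_{j-1,j}^2\, r_{j,j+1}^2)(1-r_{j+1,j+2}^2)\cdots(1-r_{q-1,q}^2); \\
R_{qq} &= (1-r_{12}^2)(1-r_{23}^2)\cdots(1-r_{q-3,q-2}^2)(1-r_{q-2,q-1}^2).
\end{aligned} \tag{6}
$$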

Upon substituting the proper values from (6) in formula (5) 
for the multiple correlation coefficients of the first and last varia- 
bles, respectively, on the others in the sequence, we find 


$$R_{1\cdot 234\cdots q} = r_{12}, \tag{7}$$

$$R_{q\cdot 123\cdots q-1} = r_{q-1,q}. \tag{8}$$



The multiple correlation coefficient of any other variable, $x_j$, on
the remaining $q-1$ variables is given by

$$R^2_{j\cdot 12\cdots q} = 1 - \frac{(1-r_{j-1,j}^2)(1-r_{j,j+1}^2)}{1-r_{j-1,j}^2\, r_{j,j+1}^2}. \tag{9}$$

It is to be noted that the right member of (9) is independent
of all of the simple correlation coefficients except $r_{j-1,j}$ and $r_{j,j+1}$.
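The dependence of the multiple correlation coefficient on the neighbouring correlations only can be checked directly. The sketch below evaluates $1 - R/R_{jj}$ from the determinant and its principal co-factor and compares it with the closed forms for the first, last and an interior variable. The function names and the sample correlations are assumptions made for illustration.

```python
from fractions import Fraction
from math import prod


def chain_matrix(links):
    """Correlation matrix with r_jp equal to the product of the links between j and p."""
    q = len(links) + 1
    m = [[Fraction(1)] * q for _ in range(q)]
    for j in range(q):
        for p in range(j + 1, q):
            m[j][p] = m[p][j] = prod(links[j:p], start=Fraction(1))
    return m


def det(matrix):
    """Exact determinant by Gaussian elimination on Fractions."""
    a = [row[:] for row in matrix]
    n, value = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            value = -value
        value *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return value


def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [[v for c, v in enumerate(row) if c != j]
            for r, row in enumerate(m) if r != i]


def multiple_r_squared(m, j):
    """Squared multiple correlation of variable j (0-based) on the rest: 1 - R / R_jj."""
    return 1 - det(m) / det(minor(m, j, j))


# Hypothetical consecutive correlations r12, r23, r34, r45.
links = [Fraction(1, 2), Fraction(1, 3), Fraction(2, 5), Fraction(3, 4)]
m = chain_matrix(links)
a2, b2 = links[1] ** 2, links[2] ** 2          # r23^2 and r34^2, the neighbours of x3
interior_closed_form = 1 - (1 - a2) * (1 - b2) / (1 - a2 * b2)
print(multiple_r_squared(m, 2) == interior_closed_form)
```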


B. Partial Correlation Coefficients.

The formula for the partial correlation coefficient between
any two variables is

$$r_{ij\cdot 1234\cdots q} = -\frac{R_{ij}}{\sqrt{R_{ii}\, R_{jj}}}. \tag{10}$$

From equation (3) we derive the following expressions for the
co-factors of elements other than those of the principal diagonal
of the determinant. Because of the symmetry of the fundamental
determinant, we know that $R_{ij} = R_{ji}$. Hence, in expanding the
co-factors of the elements of each row, we shall consider only the
$R_{ij}$ where $i < j$.

1. The co-factors of the elements of the first row are

$$R_{12} = -r_{12}(1-r_{23}^2)(1-r_{34}^2)\cdots(1-r_{q-1,q}^2), \qquad R_{1i} = 0, \quad (i = 3, 4, 5, \ldots, q). \tag{11}$$

2. The co-factors of the elements of the second row are

$$R_{23} = -r_{23}(1-r_{12}^2)(1-r_{34}^2)\cdots(1-r_{q-2,q-1}^2)(1-r_{q-1,q}^2), \qquad R_{2i} = 0, \quad (i = 4, 5, 6, \ldots, q). \tag{12}$$



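3. The co-factors of the elements of the $j$-th row are

$$R_{j,j+1} = -r_{j,j+1}(1-r_{12}^2)\cdots(1-r_{j-1,j}^2)(1-r_{j+1,j+2}^2)\cdots(1-r_{q-1,q}^2), \qquad R_{ji} = 0, \quad (i = j+2, j+3, \ldots, q). \tag{13}$$

4. The co-factor of the last element of the $(q-1)$-th row is

$$R_{q-1,q} = -r_{q-1,q}(1-r_{12}^2)(1-r_{23}^2)\cdots(1-r_{q-2,q-1}^2). \tag{14}$$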

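We see at once that all partial correlation coefficients between
non-consecutive variables vanish, as each co-factor $R_{ij} = 0$ if
$i \neq j-1,\ j,\ j+1$. The non-vanishing coefficients, those be-
tween consecutive variables, are given below.

$$r_{12\cdot 345\cdots q} = r_{12}\sqrt{\frac{1-r_{23}^2}{1-r_{12}^2\, r_{23}^2}}, \tag{15}$$

$$r_{j,j+1\cdot 1234\cdots q} = r_{j,j+1}\sqrt{\frac{(1-r_{j-1,j}^2)(1-r_{j+1,j+2}^2)}{(1-r_{j-1,j}^2\, r_{j,j+1}^2)(1-r_{j,j+1}^2\, r_{j+1,j+2}^2)}}, \tag{16}$$

$$r_{q-1,q\cdot 1234\cdots q-2} = r_{q-1,q}\sqrt{\frac{1-r_{q-2,q-1}^2}{1-r_{q-2,q-1}^2\, r_{q-1,q}^2}}.$$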






From (16) we can state that in general the partial correlation
coefficient of consecutive sums $x_j$ and $x_{j+1}$ is independent of all
simple correlation coefficients except $r_{j-1,j}$, $r_{j,j+1}$ and $r_{j+1,j+2}$.
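The two statements about partial correlations, that they vanish for non-consecutive variables and depend only on three neighbouring simple correlations for consecutive ones, can be verified on a small example. The sketch below works with the squared partial correlation $R_{ij}^2 / (R_{ii} R_{jj})$ so that exact rational arithmetic suffices; all names and sample values are illustrative assumptions.

```python
from fractions import Fraction
from math import prod


def chain_matrix(links):
    """Correlation matrix with r_jp equal to the product of the links between j and p."""
    q = len(links) + 1
    m = [[Fraction(1)] * q for _ in range(q)]
    for j in range(q):
        for p in range(j + 1, q):
            m[j][p] = m[p][j] = prod(links[j:p], start=Fraction(1))
    return m


def det(matrix):
    """Exact determinant by Gaussian elimination on Fractions."""
    a = [row[:] for row in matrix]
    n, value = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            value = -value
        value *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return value


def cofactor(m, i, j):
    """Signed co-factor of element (i, j), 0-based."""
    sub = [[v for c, v in enumerate(row) if c != j]
           for r, row in enumerate(m) if r != i]
    return (-1) ** (i + j) * det(sub)


def partial_r_squared(m, i, j):
    """Square of the partial correlation of x_i and x_j given all other variables."""
    return cofactor(m, i, j) ** 2 / (cofactor(m, i, i) * cofactor(m, j, j))


# Hypothetical consecutive correlations r12, r23, r34, r45.
links = [Fraction(1, 2), Fraction(1, 3), Fraction(2, 5), Fraction(3, 4)]
m = chain_matrix(links)
a2, b2, c2 = (r * r for r in links[:3])        # r12^2, r23^2, r34^2
consecutive_closed_form = b2 * (1 - a2) * (1 - c2) / ((1 - a2 * b2) * (1 - b2 * c2))
print(partial_r_squared(m, 0, 2), partial_r_squared(m, 1, 2) == consecutive_closed_form)
```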



C. Summary.

To summarize, we have shown that

1. The multiple correlation coefficient of a variable $x_j$
on the remaining variables of our sequence is inde-
pendent of all simple correlation coefficients except
those between $x_j$ and the immediately preceding and
the immediately following variables, respectively.

2. The partial correlation coefficients between all pairs
of non-consecutive variables in our sequence are zero;
a result that appeals to the intuition when it is recalled
that we are eliminating the effect of the variables that
form the connecting links between the two under con-
sideration.

3. The partial correlation coefficient of any pair of con-
secutive variables, $x_j$ and $x_{j+1}$, is independent of all
simple correlation coefficients except those between
the two consecutive variables in question, between the
first of these and the variable immediately preceding
it, and between the second of the pair and the variable
immediately following it.



AN EXPERIMENT REGARDING THE $\chi^2$ TEST

By

Selby Robinson*

1. Introduction

R. A. Fisher has proposed that in case the hypothesis being
tested has been partially obtained from the data, the Elderton table¹
for $\chi^2$ should be entered with $n'$ equal to, not the number of fre-
quency classes, but the number of frequency classes minus the
number of statistics computed from the data. It has been proved
under certain restrictions that this theory holds in the limit as the
size of the sample approaches infinity.²

For samples of moderate size our only guide is experimental
evidence, which indicates that Fisher's method is satisfactory in
practice.³ This is true in particular of the evidence presented in
the present paper which describes a coin tossing experiment sug-
gested by Professor H. L. Rietz.

2. The Experiment 

The experimental work here considered was done by seventy
students each of whom tossed seven coins 128 times. In any
one of the seventy experiments, this results in a frequency distri-
bution of 128 items divided into eight frequency classes. But we
lumped together the classes of zero heads and one head and like-
wise for six heads and seven heads, so that we had six frequency
classes. If for every coin on every throw the probability of heads

* National Research Fellow. 

¹ Karl Pearson, Tables for Statisticians and Biometricians, (1914),
Table XII.

² Neyman and E. S. Pearson, Biometrika, V. 20A (1928), pp. 263-
294.

³ Yule, Journal of the Royal Statistical Society, V. 85, pp. 95-104;
Brownlee, ibid., V. 87, p. 76; Neyman and Pearson, Loc. Cit.; Sheppard,
Phil. Trans., A, V. 228, p. 115.



is one-half, the expected numbers $m_i$ in the six frequency classes
are 8, 21, 35, 35, 21, 8. The divergence of the actual distribution
$n_1, \ldots, n_6$ from the theoretical one is measured by

$$\chi^2 = \sum_{i=1}^{6} \frac{(n_i - m_i)^2}{m_i}.$$

If we had possessed only one sample of 128 throws, we would have
looked in Elderton's table with the value of $\chi^2$ and with $n' = 6$
(the number of frequency classes) to find the probability that
a sample would by chance deviate from the expected distribution
so much as the actual sample had deviated. But having seventy
samples, we compared the distribution of our seventy values of $\chi^2$
with that expected from $n' = 6$. The arithmetic mean of our
seventy values of $\chi^2$ was 4.62 whereas the expected value was
five; a deviation which could very well occur by chance. So our
results are consistent with the hypothesis that the probability of a
head is always one-half.
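The expected numbers 8, 21, 35, 35, 21, 8 follow from the binomial distribution for seven coins after the two extreme pairs of classes are merged. The sketch below reproduces them and evaluates the divergence measure for one sample; the observed counts are made up for illustration and are not data from the experiment, and the function names are assumptions.

```python
from math import comb


def expected_counts(p, throws=128, coins=7):
    """Expected class frequencies, merging 0 with 1 head and 6 with 7 heads."""
    probs = [comb(coins, k) * p ** k * (1 - p) ** (coins - k)
             for k in range(coins + 1)]
    grouped = [probs[0] + probs[1]] + probs[2:6] + [probs[6] + probs[7]]
    return [throws * x for x in grouped]


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over the classes."""
    return sum((n - m) ** 2 / m for n, m in zip(observed, expected))


expected = expected_counts(0.5)
observed = [10, 19, 33, 38, 20, 8]   # hypothetical sample of 128 throws
print(expected)
print(chi_square(observed, expected))
```

With six classes and no parameter estimated from the data, the resulting value would be referred to the table with $n' = 6$, as described in the text.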

We considered next the following composite hypothesis: the
probability $p$ of heads is the same for all coins on all throws. For
any sample of 128 throws, we took as the estimate of $p$ the actual
proportion of heads among the 7 x 128 possibilities. From this
value we calculated the expected frequencies, $m_1', \ldots, m_6'$. For
each of the seventy samples, we computed

$$\chi_f^2 = \sum_{i=1}^{6} \frac{(n_i - m_i')^2}{m_i'}.$$

When we used the $\chi^2$ test to compare this distribution of $\chi_f^2$'s
with that expected for $n' = 6$, we found that $P$ = .01. But
when we compared our distribution with that expected for
$n' = 6 - 1 = 5$ we found that $P$ = .6^. The mean value of our val-
ues of $\chi_f^2$ was 3.97 compared with 6 - 1 - 1 = 4 demanded by Fish-
er's theory, and with five by the theory that $\chi_f^2$ is distributed as
$\chi^2$.


If the latter theory were correct the probability of the mean
of seventy values of $\chi_f^2$ being so far away from five, is .007.
That our distribution of $\chi_f^2$ corresponds to that expected for
$n' = 5$ whereas our distribution of the values of $\chi^2$ corresponds
to $n' = 6$, can be seen from the following table.¹


Values of $\chi^2$    Observed     Expected     Observed     Expected
(or $\chi_f^2$)       frequency    frequency    frequency    frequency
                      of $\chi^2$  $n' = 6$     of $\chi_f^2$  $n' = 5$

0-1                    3            2.6          7            6.3
1-2                    8                        12           12.2
2-3                   11                        14           12.5
3-4                   14           10.5         12           10.6
4-5                   12            9.3         10            8.3
5-7                   11           13.7          7           10.6
7-9                    5            7.8          3            5.2
greater than 9         6            7.6          5            4.3


¹ In computing $P$ we combined the first two classes of this table
and also the last two, thus making $n' = 6$.
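The expected frequencies in such a table are seventy times the probability that a $\chi^2$ variate with $n' - 1$ degrees of freedom falls in each interval. A minimal sketch of that computation follows; it uses a series for the incomplete gamma function instead of Elderton's table, which is an assumption of this illustration, and the function names are hypothetical. Its results are computed values, not readings from the source.

```python
from math import exp, gamma, log


def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom <= x), by the incomplete-gamma series."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * exp(-z + a * log(z)) / gamma(a)


def expected_frequencies(n_prime, samples=70, edges=(0, 1, 2, 3, 4, 5, 7, 9)):
    """Expected number of the 70 values in each interval, for Elderton's n'."""
    df = n_prime - 1
    cdf = [chi2_cdf(e, df) for e in edges] + [1.0]
    return [samples * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]


for n_prime in (6, 5):
    print(n_prime, [round(v, 1) for v in expected_frequencies(n_prime)])
```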






ON SAMPLING FROM COMPOUND POPULATIONS* 

By 

GEORGE MIDDLETON BROWN 
Introduction.

The decided asymmetry or the multimodality of certain fre-
quency distributions may have prompted the idea of the possibility
of the existence of frequency curves, apparently single in charac-
ter, but which, on further investigation, might be shown to be
actually composite. In other words, apparently homogeneous ma-
terial may prove to be heterogeneous, or divisible into two or more
distinct homogeneous groups.

The above ideas lead naturally to the problem of dissecting a
compound frequency function into its various components. Karl
Pearson¹ successfully solved such a problem, using the method of
moments, on the assumption that the compound parent population
was composed of two normal components. Each component curve
has three parameters, the mean (or position of axis), the standard
deviation, and the area (or total frequency). One requires there-
fore, six relations between the parameters of the given compound
frequency curve, and those of its two components, in order to deter-
mine six unknowns. The ultimate solution of the problem turns on
the determination of the zeros of a nonic equation, the location of
whose real roots is obtained, to successive approximations, by
means of the so-called Sturm's functions.

The dissection problem was taken up later, first in a paper by
Charlier,² then in a joint paper by Charlier and Wicksell³ who

* A dissertation submitted in partial fulfillment of the requirements for
the degree of Doctor of Science in the University of Michigan. June, 1933.

¹ On the dissection of frequency curves into normal curves. Karl Pear-
son. Phil. Trans. Roy. Soc. Lond. Vol. 185, Pt. I, pp. 71-110. 1894A.

² Researches into the theory of probability. C. V. L. Charlier. Meddel-
anden från Lunds Astron. Observ. Ser. 2. Bd. 1. 1906.

³ On the dissection of frequency functions. C. V. L. Charlier, and S. D.
Wicksell. Arkiv för Matematik, Astron. och Fysik, (Meddelande) Band
18. No. 6. 1923.



considerably simplified the theory, finally arriving, however, at the
fundamental nonic due to Pearson, for the solution of which, they
suggested the use of a graphical method. They also studied special
cases of the more general problem, e.g. the means of the two com-
ponents assumed known, the compound curve assumed symmetri-
cal, or the standard deviations of the two components supposed
equal. In addition, they extended the problem to the case of fre-
quency functions of two variates.

In the present paper, I propose to investigate the sampling
problem in the case of compound distribution functions, and from
a consideration of the dissection problem, one is led to a division
of the present investigation into two main parts, for the following
reasons.

On the one hand, in sampling from a compound population,
if we do not know the proportion contributed to the total frequency
of the sample by each of the two components of the parent popula-
tion, we are essentially sampling from a single population. That is,
random samples of $N$ are drawn from a single composite parent
population made up of two components. Hence, the previously
obtained results for sampling from a single parent population will
be available if we derive expressions for the parameters of the
compound parent in terms of the parameters of its components.
This is done in Part 1.

On the other hand, however, if we know the proportion con-
tributed to the total frequency of a sample by each of the two
components, the situation differs entirely from that studied in Part
1. Here we are concerned with sampling from two distinct parent
populations, and in Part 2, I develop a method for dealing with
this problem. Thus, in Part 2, it is assumed that samples of $r$
and $s$ respectively are drawn from two distinct parent popula-
tions, and these two samples are then combined to yield a sample
of $r + s = N$ from the combined populations.




Therefore in Part 1, we are essentially sampling from a single 
parent population, whereas in Part 2, we are sampling from mul- 
tiple populations. 

The developments of Part 2 yield some new sampling results
for sampling from two parent populations. In Section 6, I derive
expressions for the semi-invariants "of moments about a fixed
point" in samples from the compound frequency function, in terms
of the corresponding semi-invariants of the moments of its compo-
nents. In Section 7, expressions are derived for the semi-invariants
of "moments in samples from the compound population about the
mean of the combined sample," in terms of $r$ and $s$, and the semi-
invariants of the two components themselves.

The occurrence of a certain class of well-known polynomials 
in the development of Section 1, is of especial interest, since these 
are, except perhaps for sign, the semi-invariants of the binomial 
distribution, and have some rather important properties, and their 
further study, although not pertinent to the problem in hand, should 
yield some very interesting results. 

Section S is devoted to the discussion of the case in which a 
limiting compound frequency function exists, under certain as-
sumptions regarding the nature of its components, where the num-
ber of the latter is allowed to increase indefinitely. This idea of a
limit frequency function would appear to indicate the possibility
of a new approach to the theory of frequency curves, in which the
variable may now be a complete frequency distribution in itself.

This investigation was begun on the suggestion of Professor
C. C. Craig, of the University of Michigan, U. S. A. to whom I am
indebted for constant inspiration and guidance during its pursuit.


PART 1 


Section 1. The semi-invariants of the compound frequency
function, in terms of the semi-invariants of its two normal com-
ponents.


The main object of this first section, is to obtain expressions
for the parameters of a compound population in terms of the para-
meters of its two normal components, and to this end, I shall use
the following definition of the semi-invariants of Thiele.¹

$$e^{\lambda_1 t + \lambda_2 \frac{t^2}{2!} + \lambda_3 \frac{t^3}{3!} + \cdots} = \int_{-\infty}^{\infty} f(x)\, e^{xt}\, dx.$$


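I write therefore

$$f(x) = p\,\varphi_1(x) + q\,\varphi_2(x),$$

in which $f(x)$ is the compound frequency function, $\varphi_1(x)$, $\varphi_2(x)$ are
its two normal components, and $p + q = 1$.

If $L_1$, $L_2$, etc. are the semi-invariants of $f(x)$, then

$$e^{L_1 t + L_2 \frac{t^2}{2!} + L_3 \frac{t^3}{3!} + \cdots} = p\, e^{m_1 t + \sigma_1^2 \frac{t^2}{2!}} + q\, e^{m_2 t + \sigma_2^2 \frac{t^2}{2!}}, \tag{1}$$

where $m_1$, $m_2$, $\sigma_1$, $\sigma_2$ are the means and standard deviations of
$\varphi_1(x)$, $\varphi_2(x)$ respectively. For convenience, I write

$$a = m_2 - m_1, \qquad b = \sigma_2^2 - \sigma_1^2, \qquad r = \frac{q}{p}.$$

We wish to express the $L_n$ in (1) in terms of the quantities
$p$, $q$, $a$, $b$.

Taking logarithms in (1),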

¹ Numerous references relating to the theory of semi-invariants may be
found at the end of "An application of Thiele's semi-invariants to the sam-
pling problem", C. C. Craig. Metron, Vol. 7, No. 4, 1928, p. 7.1



$$L_1 t + L_2 \frac{t^2}{2!} + L_3 \frac{t^3}{3!} + \cdots = \log p + \left(m_1 t + \sigma_1^2 \frac{t^2}{2!}\right) + \log\left(1 + r\, e^{at + b\frac{t^2}{2!}}\right). \tag{2}$$


We require now a suitable form for the expansion of the third 
term of the right member of (2) in successive powers of t . We 
have 

( 

"fog V i+ re / = foq (1+ r)+ foq 



Further 



The complete representation of terms of the type 



in the right member of (3) will be 




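Therefore, the coefficient of $\dfrac{t^n}{n!}$ in the right member of (4) is

$$\sum_{j=\left[\frac{n+1}{2}\right]}^{n} \frac{n!\; a^{2j-n}\, b^{n-j}\, i^{j}}{2^{n-j}\,(2j-n)!\,(n-j)!},$$

where $\left[\frac{n+1}{2}\right]$ means the largest integer in $\frac{n+1}{2}$.

Then, the coefficient of $\dfrac{t^n}{n!}$ in the right member of (3) is

$$L_n = \sum_{k=1}^{n} \sum_{i=1}^{k} \sum_{j=\left[\frac{n+1}{2}\right]}^{n} \frac{(-1)^{i+1}}{k}\binom{k}{i}\, q^k\, \frac{n!\; a^{2j-n}\, b^{n-j}\, i^{j}}{2^{n-j}\,(2j-n)!\,(n-j)!}; \quad n > 2, \tag{5}$$

and this is the relation sought, in which the semi-invariants of the
compound frequency function are expressed in terms of the semi-
invariants of its two normal components. Below, I have written
out in detail the expressions for $L_1$ to $L_9$ inclusive.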

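$$
\begin{aligned}
L_1 &= m_1 + aq. \\
L_2 &= \sigma_1^2 + bq + a^2 pq. \\
L_3 &= a^3 pq\,q_1 + 3ab\,pq. \\
L_4 &= a^4 pq\,q_2 + 6a^2 b\,pq\,q_1 + 3b^2 pq. \\
L_5 &= a^5 pq\,q_3 + 10a^3 b\,pq\,q_2 + 15ab^2 pq\,q_1. \\
L_6 &= a^6 pq\,q_4 + 15a^4 b\,pq\,q_3 + 45a^2 b^2 pq\,q_2 + 15b^3 pq\,q_1. \\
L_7 &= a^7 pq\,q_5 + 21a^5 b\,pq\,q_4 + 105a^3 b^2 pq\,q_3 + 105ab^3 pq\,q_2. \\
L_8 &= a^8 pq\,q_6 + 28a^6 b\,pq\,q_5 + 210a^4 b^2 pq\,q_4 + 420a^2 b^3 pq\,q_3 + 105b^4 pq\,q_2. \\
L_9 &= a^9 pq\,q_7 + 36a^7 b\,pq\,q_6 + 378a^5 b^2 pq\,q_5 + 1260a^3 b^3 pq\,q_4 + 945ab^4 pq\,q_3.
\end{aligned} \tag{6}
$$

in which

$$
\begin{aligned}
q_1 &= 1 - 2q. \\
q_2 &= 1 - 6q + 6q^2. \\
q_3 &= 1 - 14q + 36q^2 - 24q^3. \\
q_4 &= 1 - 30q + 150q^2 - 240q^3 + 120q^4. \\
q_5 &= 1 - 62q + 540q^2 - 1560q^3 + 1800q^4 - 720q^5. \\
q_6 &= 1 - 126q + 1806q^2 - 8400q^3 + 16800q^4 - 15120q^5 + 5040q^6. \\
q_7 &= 1 - 254q + 5796q^2 - 40824q^3 + 126000q^4 - 191520q^5 + 141120q^6 - 40320q^7.
\end{aligned} \tag{7}
$$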
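The expressions for the low-order semi-invariants can be checked against a direct computation. The sketch below forms the exact raw moments of a two-component normal mixture, converts them to semi-invariants by the usual moment recursion, and compares the third, fourth and fifth with the closed forms in $a$, $b$, $p$, $q$. The parameter values and function names are illustrative assumptions, with $a = m_2 - m_1$ and $b = \sigma_2^2 - \sigma_1^2$.

```python
from fractions import Fraction as F
from math import comb


def normal_moments(m, v, n):
    """Raw moments E[X^0..X^n] of a normal law with mean m and variance v."""
    mu = [F(1), m]
    for k in range(1, n):
        mu.append(m * mu[k] + k * v * mu[k - 1])
    return mu


def cumulants(mu):
    """Semi-invariants from raw moments (index 0 is unused)."""
    kappa = [F(0)] * len(mu)
    for n in range(1, len(mu)):
        kappa[n] = mu[n] - sum(comb(n - 1, k - 1) * kappa[k] * mu[n - k]
                               for k in range(1, n))
    return kappa


# Hypothetical components: N(m1, v1) with weight p and N(m2, v2) with weight q.
m1, v1, m2, v2, p = F(0), F(1), F(2), F(9, 4), F(3, 5)
q = 1 - p
mix = [p * x + q * y
       for x, y in zip(normal_moments(m1, v1, 5), normal_moments(m2, v2, 5))]
L = cumulants(mix)

a, b = m2 - m1, v2 - v1
q1, q2, q3 = 1 - 2 * q, 1 - 6 * q + 6 * q ** 2, 1 - 14 * q + 36 * q ** 2 - 24 * q ** 3
L3 = a ** 3 * p * q * q1 + 3 * a * b * p * q
L4 = a ** 4 * p * q * q2 + 6 * a ** 2 * b * p * q * q1 + 3 * b ** 2 * p * q
L5 = a ** 5 * p * q * q3 + 10 * a ** 3 * b * p * q * q2 + 15 * a * b ** 2 * p * q * q1
print(L[3] == L3, L[4] == L4, L[5] == L5)
```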

The expressions for the $L_n$ in (6) have two properties, which
enable one to write them down readily. In the first place, assuming
that the polynomials in $q$ (or $p = 1 - q$) are suppressed, i.e. $q_1$, $q_2$,
etc., are all set equal to unity, then the
resulting functions in "a" and "b" are readily obtained by means of
a well-known recursion formula. Secondly, considering the poly-
nomials as coefficients in the several terms of the original com-
plete expressions for the $L_n$ in (6), for $n > 2$, and arranging
these expressions so that their corresponding terms appear in col-
umns, the first terms in the first column, the second terms in the
second column, and so on, then every term in any diagonal array
proceeding from upper left to lower right, and consisting of one
and only one term from each of the expressions (6), will have the
same polynomial coefficient.

I proceed now to obtain expressions for the $L_n$ in (6), in
which the individuality of the polynomials $q_1$, $q_2$, etc., has
been suppressed. This time I write


( 8 ) 


foq(l 


+ re j = foq 




The term in 


t'in 


( bt \k . 


C2k-sj!Cs-l<)!Z®'*^ 

Rearranging the series in brace of right hand member of (8) 
in successive powers of t , 


foqUtre 2"-"(Zk-s)l(s-k)l 


( 9 ) 


= ■foq(i+-rJ+ toq 


X 0 / 





where 9^^ - ^Ai 1,2., 5, > * . 

4-S ^ 

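Therefore, the coefficient of $\dfrac{t^s}{s!}$ in the series in the brace in the
right member of (9) is $\theta_s$, where

$$\theta_s = q \sum_{k=\left[\frac{s+1}{2}\right]}^{s} \frac{s!\; a^{2k-s}\, b^{s-k}}{2^{s-k}\,(2k-s)!\,(s-k)!}. \tag{10}$$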


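From equations (2) and (9) above, we have

$$L_1 t + L_2 \frac{t^2}{2!} + \cdots = \log p + \log(1+r) + \left(m_1 t + \sigma_1^2 \frac{t^2}{2!}\right) + \log\left(1 + \theta_1 t + \theta_2 \frac{t^2}{2!} + \cdots\right). \tag{11}$$

The first two terms of the right member cancel, since
$\log p + \log(1+r) = 0$. Therefore, equation (11) becomes

$$L_1 t + L_2 \frac{t^2}{2!} + \cdots = \left(m_1 t + \sigma_1^2 \frac{t^2}{2!}\right) + \log\left(1 + \theta_1 t + \theta_2 \frac{t^2}{2!} + \cdots\right). \tag{12}$$

One might note, in passing, that in (12), the $\theta_s$ are playing the
role of moments, if one recalls the definition of semi-invariants,
so that it would be possible to write down a second general expres-
sion for the $L_s$, using the well-known formula for semi-invariants
in terms of moments.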

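The first six $\theta_s$ take the following form

$$
\begin{aligned}
\theta_1 &= qa \\
\theta_2 &= q(a^2 + b) \\
\theta_3 &= q(a^3 + 3ab) \\
\theta_4 &= q(a^4 + 6a^2 b + 3b^2) \\
\theta_5 &= q(a^5 + 10a^3 b + 15ab^2) \\
\theta_6 &= q(a^6 + 15a^4 b + 45a^2 b^2 + 15b^3)
\end{aligned}
$$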
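The remark that these quantities play the role of moments can be illustrated as follows. The sketch evaluates them from the general sum over $k$ (coefficient $s!/(2^{s-k}(2k-s)!(s-k)!)$), compares two of them with the written-out forms, and then applies the ordinary moment-to-semi-invariant relation for the third order, $\theta_3 - 3\theta_1\theta_2 + 2\theta_1^3$, to recover the third semi-invariant of the compound function. The numerical values of $a$, $b$, $q$ and the function name `theta` are assumptions chosen for illustration.

```python
from fractions import Fraction as F
from math import factorial


def theta(s, a, b, q):
    """q times the s-th raw moment of a normal law with mean a and variance b."""
    total = F(0)
    for k in range((s + 1) // 2, s + 1):
        coeff = F(factorial(s),
                  2 ** (s - k) * factorial(2 * k - s) * factorial(s - k))
        total += coeff * a ** (2 * k - s) * b ** (s - k)
    return q * total


# Hypothetical values of a, b and q.
a, b, q = F(3, 2), F(1, 4), F(2, 5)
p = 1 - q
t1, t2, t3 = (theta(s, a, b, q) for s in (1, 2, 3))

# Third semi-invariant from the "moments" theta, versus the closed form.
third_from_theta = t3 - 3 * t1 * t2 + 2 * t1 ** 3
third_closed_form = a ** 3 * p * q * (1 - 2 * q) + 3 * a * b * p * q
print(third_from_theta == third_closed_form)
```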




If, in the last set of relations, we set q = 1 , we then have 
and if in the expressions (6), , etc., he all .set equal 

to unity, we shall get, for n > Z , 

L„=^3n. 

I shall now show that the & follow the recursion law 


( 13 ) 





Kow, putting we have 




and in general 

(14) = (b, a+bt), 

dt 

where (^^y) is ^ polynomial of degree in ^ and y . 
It IS easily shown that 


(15) e 


md 


'di 


■p(b.arbt) 


r [■Gfb,a+bt)'e^‘'^^ 


t-o 


da t 


t-0 





and that 

d ‘P(^) 

(16) 




m 


t=0 


t=0 


Now deriving the left member of (14) with respect to $t$, and then
setting $t = 0$, gives the next $\beta$, namely $\beta_{n+1}$, by definition, whilst
the derivative of the right member of (14), and setting $t = 0$,
would equal the sum of the right members of (15) and (16), which
establishes the proof.

The second property of the expressions for the $L_n$ in (6),
which requires proof, may be stated as a theorem thus: "The $k$-th
polynomial coefficient in the expression for the semi-invariant
$L_{2m}$, $m > 1$, is identically equal to the $(k+1)$-th polynomial co-
efficient in the expression for the semi-invariant $L_{2m+1}$."

For simplicity, I have considered the first and second poly-
nomial coefficients of $L_{2m}$ and $L_{2m+1}$ respectively, the proof going
through in exactly the same manner if perfectly general terms in
these expressions were considered.

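From (5), suppose that $n = 2m$ (even). Then

$$L_{2m} = \sum_{k=1}^{2m} \sum_{i=1}^{k} \sum_{j=m}^{2m} \frac{(-1)^{i+1}}{k}\binom{k}{i}\, q^k\, \frac{(2m)!\; a^{2j-2m}\, b^{2m-j}\, i^{j}}{2^{2m-j}\,(2j-2m)!\,(2m-j)!}. \tag{17}$$

The leading term in $L_{2m}$, i.e. the first term in $a^{2m}$, multiplied
by a polynomial in $q$, is obtained from (17), by setting $j = 2m$.
It is

$$a^{2m} \sum_{k=1}^{2m} \sum_{i=1}^{k} \frac{(-1)^{i+1}}{k}\binom{k}{i}\, q^k\, i^{2m}.$$

Therefore the polynomial coefficient of $a^{2m}$, i.e. the leading co-
efficient in $L_{2m}$, is

$$\sum_{k=1}^{2m} \sum_{i=1}^{k} \frac{(-1)^{i+1}}{k}\binom{k}{i}\, q^k\, i^{2m}. \tag{18}$$

Again, from (5), when $n = 2m+1$ (odd),

$$L_{2m+1} = \sum_{k=1}^{2m+1} \sum_{i=1}^{k} \sum_{j=m+1}^{2m+1} \frac{(-1)^{i+1}}{k}\binom{k}{i}\, q^k\, \frac{(2m+1)!\; a^{2j-2m-1}\, b^{2m+1-j}\, i^{j}}{2^{2m+1-j}\,(2j-2m-1)!\,(2m+1-j)!}.$$

The second term in $L_{2m+1}$ is obtained by setting $j = 2m$. It is

$$m(2m+1)\, a^{2m-1}\, b \sum_{k=1}^{2m+1} \sum_{i=1}^{k} \frac{(-1)^{i+1}}{k}\binom{k}{i}\, q^k\, i^{2m}.$$

Therefore, the polynomial coefficient of $a^{2m-1} b$, i.e. the second
coefficient in $L_{2m+1}$, is

$$\sum_{k=1}^{2m+1} \sum_{i=1}^{k} \frac{(-1)^{i+1}}{k}\binom{k}{i}\, q^k\, i^{2m}. \tag{18'}$$

Comparing (18) and (18'), which must be identically equal, if our
theorem is true, it remains to show that

$$\frac{q^{2m+1}}{2m+1} \sum_{i=1}^{2m+1} (-1)^{i+1}\binom{2m+1}{i}\, i^{2m} = 0,$$

but it is well known that this expression is identically zero.²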
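The identity invoked at the end of the proof is the familiar fact that an alternating binomial sum of $i^{s}$ over $i = 1, \ldots, n$ vanishes whenever $1 \le s < n$. A short check for the case $n = 2m+1$, $s = 2m$ is sketched below; the function name is an assumption.

```python
from math import comb


def alternating_sum(n, power):
    """Sum over i = 1..n of (-1)^(i+1) * C(n, i) * i^power."""
    return sum((-1) ** (i + 1) * comb(n, i) * i ** power for i in range(1, n + 1))


# n = 2m + 1 and power = 2m, for m = 1, ..., 6.
values = [alternating_sum(2 * m + 1, 2 * m) for m in range(1, 7)]
print(values)
```

When the power equals $n$ the sum no longer vanishes, which is why the argument needs the exponent $2m$ to be smaller than the upper limit $2m+1$.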


² See Hall and Knight, Higher Algebra, p. 259, Ex. 2.



Section 2. A table of values of a certain class of poly-
nomials in one variable, for different values of the argument.

In order to facilitate the actual computation of the values for
the semi-invariants $L_n$, given in Section 1, in a particular appli-
cation of the theory, when $a = m_2 - m_1$, and $b = \sigma_2^2 - \sigma_1^2$, are
known, I consider the expressions for the $L_n$, as they appear in
the form indicated in (6). Now, when $b = 0$, i.e. the two
components have identical standard deviations, the set of relations
(6) take the form

$$L_1' = m_1 + aq, \qquad L_2' = \sigma_1^2 + a^2 pq, \qquad \text{and} \quad L_n' = a^n pq\, q_{n-2}; \quad n > 2, \tag{19}$$

in which $q_1$, $q_2$, etc., have the same significance as in (7). Mak-
ing use of the properties of the expressions (6), which were stated
at the end of Section 1, from (19) we may write the $L_n$, for
$n > 4$, as follows

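$$
\begin{aligned}
L_5 &= L_5' + 10\left(\tfrac{b}{a}\right) L_4' + 15\left(\tfrac{b}{a}\right)^2 L_3', \\
L_6 &= L_6' + 15\left(\tfrac{b}{a}\right) L_5' + 45\left(\tfrac{b}{a}\right)^2 L_4' + 15\left(\tfrac{b}{a}\right)^3 L_3', \\
L_7 &= L_7' + 21\left(\tfrac{b}{a}\right) L_6' + 105\left(\tfrac{b}{a}\right)^2 L_5' + 105\left(\tfrac{b}{a}\right)^3 L_4', \\
L_8 &= L_8' + 28\left(\tfrac{b}{a}\right) L_7' + 210\left(\tfrac{b}{a}\right)^2 L_6' + 420\left(\tfrac{b}{a}\right)^3 L_5' + 105\left(\tfrac{b}{a}\right)^4 L_4', \\
L_9 &= L_9' + 36\left(\tfrac{b}{a}\right) L_8' + 378\left(\tfrac{b}{a}\right)^2 L_7' + 1260\left(\tfrac{b}{a}\right)^3 L_6' + 945\left(\tfrac{b}{a}\right)^4 L_5',
\end{aligned} \tag{20}
$$

and so on.

Therefore, for $n > 4$, the general semi-invariants $L_n$ of (6)
may be expressed in terms of the special semi-invariants $L_n'$,
obtained from the former by setting $b = 0$. From (20) in general
we have

$$L_n = \sum_{k=0}^{\left[\frac{n}{2}\right]} \frac{n!}{2^k\, k!\,(n-2k)!}\left(\frac{b}{a}\right)^k L_{n-k}', \tag{20'}$$

in which

$$L_{n-k}' = a^{n-k}\, q_{n-k}', \qquad q_{n-k}' = pq\, q_{n-k-2},$$

because

$$L_n = \sum_{k=0}^{\left[\frac{n}{2}\right]} \frac{n!}{2^k\, k!\,(n-2k)!}\, a^{n-2k}\, b^k\, q_{n-k}' = \sum_{k=0}^{\left[\frac{n}{2}\right]} \frac{n!}{2^k\, k!\,(n-2k)!}\left(\frac{b}{a}\right)^k a^{n-k}\, q_{n-k}',$$

and the last expression, for $n > 4$, is the equivalent of the general
expression (5) for the $L_n$.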

Further, if we consider the terms in the expressions (20) as
elements in a set of diagonal arrays, as I have indicated, it is evi-
dent that, moving downwards along any particular diagonal, any
term in this diagonal is obtained from the one immediately pre-
ceding it, by the use of a multiplier. A formula for the
calculation of the multiplier, $M_{k+1,n}$, may be derived as follows.

Consider any term of say » whose numerical coefficient is 
Cet this be the(k+l)5t term. 

Then 

^k+l,n ‘ £kj^| 





Similarly, take the (k+2)nditerrn of L , with numerical coeffi- 
cient C ^ . Then 

k-t-;2.,n+l 

K+£,nfl 

and we note that C, , and C, - .are tlie numerical co- 
k+1, n K+Z^n+i 

efficients of two adjacent terms in one of the diagonal arrays men- 
tioned above. Therefore 




fn-f-l) 


Czk+Z) 




C 

k-f Zj n+1 


(■n+i)(n-21<.) 

2(k+i; 


C 

k+l^n 


niZk 


and 


M 


k+i,n 


(n+i)(n-Z.k.) 


It is of considerable interest to note, that the $L_n'$ of (19) (for
$n > 2$), are, except perhaps for sign, the product of the semi-in-
variants $\lambda_n$ $(n > 1)$ of the binomial distribution and appropriate
powers of "a". To show this, we need only consider the generating
function for the $L_n'$, namely $1 + q(e^{at} - 1)$, and that for the $\lambda_n$, viz.
$[1 + p(e^{t} - 1)]^s$, with $s = 1$. Frisch¹ has obtained a recursion
formula for the $\lambda_n$, which is

$$\lambda_n = pq\, \frac{d\lambda_{n-1}}{dp}, \qquad n > 1.$$


¹ Sur les semi-invariants et moments employés dans l'étude des distri-
butions statistiques. Ragnar Frisch. Skrifter utgitt av Det Norske Viden-
skaps-Akademi i Oslo. 1926. No. 3, Ch. 2, p. 29.



Therefore, it is evident that the $q_n'$ obey a corresponding recur-
sion formula

$$q_n' = pq\, \frac{dq_{n-1}'}{dq}.$$

In fact, the polynomials $q_2' = pq$, $q_3' = pq\,q_1$, $q_4' = pq\,q_2$, etc., are the
same functions of $q$ as the $\lambda_n$ are functions of $p$, i.e.

$$q_n' = \lambda_n(q), \qquad \text{for } n \geq 2.$$


To investigate thoroughly the properties of the polynomials
$q_2'$, $q_3'$, etc., would be irrelevant to the problem in hand, but,
so far as I know, such a study has not been carried out. I will,
however, mention a few of these properties, which appear inter-
esting.

1. The roots of the polynomial $q_n'$ (for any $n$) are all real
and distinct, and these roots all lie in the interval (0, 1), zero
and unity being roots of every polynomial.

2. The roots of $q_n'$ separate the roots of $q_{n+1}'$.

3. The polynomials $q_{2n}'$, of even degree in $q$, are symmetrical
with respect to the line $q = \tfrac{1}{2}$, whilst those of odd degree in $q$,
namely $q_{2n+1}'$, are symmetrical with respect to the point
$(\tfrac{1}{2}, 0)$.

4. An orthogonality property in (0, 1) holds if $m \neq n$, and $m + n$
is odd. That is

$$\int_0^1 q_m'\, q_n'\, dq = 0, \qquad m \neq n,\ m+n \text{ (odd)},$$

but

$$\int_0^1 q_m'\, q_n'\, dq \neq 0, \qquad m \neq n,\ m+n \text{ (even)}.$$

These same polynomials appear as functions of $p$ in a paper by H.
C. Carver on "Fundamentals of the theory of Sampling." Amer. Statist.
Assoc., Annals of Math. Statistics. Vol. I, No. 1, Feb. 1930, p. 106.





5. Further 


n-1. 




0 

in which is the Bernoulli number of order 2n . 

Zn ^ 

0 

In calculating the actual values of the $L_n$ expressed in the
form (20), it would obviously be very convenient to have at one's
disposal a table of values of the polynomials $q_2' = pq$, $q_3' = pq\,q_1$,
$q_4' = pq\,q_2$, etc., for a range of values of $q$, since the latter, when
multiplied by appropriate powers of "a", are the $L_n'$ of (19) and
(20). I have, therefore, set up such tables, for values of the varia-
ble $q$ ranging from .01 to 1.0 inclusive, at intervals of .01.

It is to be observed that only functional values are recorded
here for $.01 \leq q \leq .50$, since we would merely repeat these values
when $.50 \leq q \leq 1.0$, in the case of the polynomials $q_{2n}'$, of even
degree, whilst in the case of those of odd degree, namely $q_{2n+1}'$,
there would merely be a change of sign. For it is easily seen that,
writing $q_n' = \lambda_n(q)$,

$$\lambda_n(p) = (-1)^n \lambda_n(q), \qquad (p + q) = 1.$$

Hence

$$q_{2n}'(1-q) = q_{2n}'(q)$$

and

$$q_{2n+1}'(1-q) = -\,q_{2n+1}'(q).$$

I have calculated the exact functional values of all the poly-
nomials $q_n'$, for $2 \leq n \leq 9$, and these values appear in the tables
for those cases in which $n \leq 4$, but for $n > 4$ the functional
values are written down correct only to eight decimal places. Each
polynomial is set out in detail below.

$$
\begin{aligned}
q_4' &= q - 7q^2 + 12q^3 - 6q^4. \\
q_5' &= q - 15q^2 + 50q^3 - 60q^4 + 24q^5. \\
q_6' &= q - 31q^2 + 180q^3 - 390q^4 + 360q^5 - 120q^6. \\
q_7' &= q - 63q^2 + 602q^3 - 2100q^4 + 3360q^5 - 2520q^6 + 720q^7. \\
q_8' &= q - 127q^2 + 1932q^3 - 10206q^4 + 25200q^5 - 31920q^6 + 20160q^7 - 5040q^8. \\
q_9' &= q - 255q^2 + 6050q^3 - 46620q^4 + 166824q^5 - 317520q^6 + 332640q^7 - 181440q^8 + 40320q^9.
\end{aligned}
$$

 $q$     $q_2'$    $q_3'$      $q_4'$         $q_5'$

 .01     .0099     .0097 02    .0093 1194     .0085 4940
 .02     .0196     .0188 16    .0172 9504     .0143 9048
 .03     .0291     .0273 54    .0240 1914     .0178 0198
 .04     .0384     .0353 28    .0295 5264     .0190 4886
 .05     .0475     .0427 50    .0339 6250     .0183 8250
 .06     .0564     .0496 32    .0373 1424     .0160 4106
 .07     .0651     .0559 86    .0396 7194     .0122 4974
 .08     .0736     .0618 24    .0410 9824     .0072 2104
 .09     .0819     .0671 58    .0416 5434     .0011 5512
 .10     .0900     .0720 00    .0414 0000    -.0057 6000
 .11     .0979     .0763 62    .0403 9354    -.0133 4808
 .12     .1056     .0802 56    .0386 9184    -.0214 4440
 .13     .1131     .0836 94    .0363 5034    -.0298 9550
 .14     .1204     .0866 88    .0334 2304    -.0385 5882
 .15     .1275     .0892 50    .0299 6250    -.0473 0250
 .16     .1344     .0913 92    .0260 1984    -.0560 0502
 .17     .1411     .0931 26    .0216 4474    -.0645 5494
 .18     .1476     .0944 64    .0168 8544    -.0728 5064
 .19     .1539     .0954 18    .0117 8874    -.0807 9996
 .20     .1600     .0960 00    .0064 0000    -.0883 2000
 .21     .1659     .0962 22    .0007 6314    -.0953 3676
 .22     .1716     .0960 96   -.0050 7936    -.1017 8488
 .23     .1771     .0956 34   -.0110 8646    -.1076 0738
 .24     .1824     .0948 48   -.0172 1856    -.1127 5530
 .25     .1875     .0937 50   -.0234 3750    -.1171 8750
 .26     .1924     .0923 52   -.0297 0656    -.1208 7030
 .27     .1971     .0906 66   -.0359 9046    -.1237 7722
 .28     .2016     .0887 04   -.0422 5536    -.1258 8872
 .29     .2059     .0864 78   -.0484 6886    -.1271 9184
 .30     .2100     .0840 00   -.0546 0000    -.1276 8000
 .31     .2139     .0812 82   -.0606 1926    -.1273 5264
 .32     .2176     .0783 36   -.0664 9856    -.1262 1496
 .33     .2211     .0751 74   -.0722 1126    -.1242 7766
 .34     .2244     .0718 08   -.0777 3216    -.1215 5658
 .35     .2275     .0682 50   -.0830 3750    -.1180 7250
 .36     .2304     .0645 12   -.0881 0496    -.1138 5078
 .37     .2331     .0606 06   -.0929 1366    -.1089 2110
 .38     .2356     .0565 44   -.0974 4416    -.1033 1720
 .39     .2379     .0523 38   -.1016 7846    -.0970 7652
 .40     .2400     .0480 00   -.1056 0000    -.0902 4000
 .41     .2419     .0435 42   -.1091 9366    -.0828 5172
 .42     .2436     .0389 76   -.1124 4576    -.0749 5864
 .43     .2451     .0343 14   -.1153 4406    -.0666 1034
 .44     .2464     .0295 68   -.1178 7776    -.0578 5866
 .45     .2475     .0247 50   -.1200 3750    -.0487 5750
 .46     .2484     .0198 72   -.1218 1536    -.0393 6246
 .47     .2491     .0149 46   -.1232 0486    -.0297 3058
 .48     .2496     .0099 84   -.1242 0096    -.0199 2008
 .49     .2499     .0049 98   -.1248 0006    -.0099 9000
 .50     .2500     .0000 00   -.1250 0000     .0000 0000



Section 3. Approxinuitc expressions fo) the scmiM7ivar%ants 
of in samples from the compound frequency 

function. 

In the paper of C. C. Craig, already cited, the author obtained
the following results for sampling from a single parent population.

(1) Expressions' for the sampling characteristics of the correla*- 

tion functions for , and , in terms of N , the 

size of the sample, and the characteristics of the population 
Itself. 

( 2 ) Expressions'^ for the sampling characteristics of the distribu- 
tion functions for , and ni terms of certain 

functions, the latter being defined by 


( 21 ) 




(sj sj )= 

^ ^ krn + -En 

n / Z — ^ 


in which (\/^, ) are the characteristics of the corre- 
lation function for . 

I can now make use of the results indicated above, in conjunc- 
tion with the relations (6) of the present paper, in order to deter- 
mine approximate expressions for the semi-invariants of <^3 , , 

, in samples from the compound frequency function, retaining 
only terms of order ~2 and higher in N in using expressions (1), 
and only those ‘‘g” functions in using the expressions (2) which 
are of order ^ 2 and higher in N , where is of order — 

A. The semi-invariants of , vix. h , b_ , b, , etc. 

^ 1 4 o 

From definition ( 21 ) and the relations ( 1 ) for m=Z, , 
and making use of the following notation 


¹ Loc. Cit. p. 57 et seq.

² Loc. Cit. p. 50 et seq.





(22) Cp.: 





3fN 


( W' 


in which the lire the same as in (6) of Section 1, 1 olnain the 
following set nf “'g'* functions 

%1 =^[C^-i-)0^-2)cP3j 

=^^[(N-i)'^N^z)4)5+6rifN-iXN z)(P^j 

( 23 ) 

9bz {fN-l)^(N-2X4)^.9N(N-l)(N-2X4)^ + 
9N(N-l)(M-2)^Cp3^6H^(N-l)CN-2)]. 
^(Na)^4)^+12N(Na)^4)4.+^I^rN4)(fN-2;4^3+8N^('N-^ 

+i2N(N-i)(N-2X2r>|-3)4)^3+^SNYN-l)(N-2)4|. 

[cN-l)"fN-2f4>a^2iM(''N-l)"(N-2)^cf, 

+ 6 N (:N'i)('N-2)^fen4l)(P53+ 9 N (N-l) (lN-2f (3IN-5)(1)^^ 
+i8I^XrN4)fN-z)C5Nai)fp4+48N^(N4)Crt-2X9r>l-20}4)3" 

+36nViN-1)('N-2)| 





^rM-l}^N-z)^<P^4-2Tr(rH-i)^fN-Z3^4V ^27N (H-i)(N-2)Y3N-4.)({, 

+ 27lN(r(-l)(lS-2)Y^^('7j(P5^+ 54- N N-1 YN-Z) Y4-r/' j) Cp^ 

+16Z N YN-lXN'2)Y5N-12)(j)^3 +36N 7 N -30IV-^34}(g^ 

+ioafN YN-iXh(-2)C5N-i2)4)3 1 . 


and on substituting these values for the in the expressions 
for the semi-invariants of 0(3 from (he relations (2), I get: — 





-513 cfe - 8104)^^- 1 4), -loe^j-f^j wiol)/ 


9 1 ) 891^21 61^ 27^ 99^12 

-2<F38T%5'-7%6+£^a^b" YU UU 


v° 

In a similar manner I obtain 
R. The semi-invariants of oC^, viz , etc. 

In this case m=*2, n=:4 , and the ‘‘g” functions tliis tune are 


.^^o=^,[N(N-z}<P,.2NrN-l)} 

^,^=77z[(No;(i),.Z(7N-25)cj)^.6(N-5}cp;+lZ(N-2)] 

■,lZflTN-dd)(P^+7Zf3N-Z0)(l>/+24(4!i-i3)j 

Zb4l+ 40Cp^j-v34(pVlT64)^-v 1444)3^72| 





+Z0i6 mz 1584 (p^+ Z85Z (P“+ 768 

+T844cp^j^+137T6cP2^^+7848cP^5'\ dZ80(P^l\253A4 
+567£cP^+1144a^|^^T3440cp^j^^+51046(P^“+il9Z32^p^^^^^^ 

+ 1Z45 6 4)^^49248 110592 ^P/j^ 9504(i)^^ j 

Therefore, substituting from (24) into the expressions for the
semi-invariants of $\alpha_4$ given by the relations (2), we have:

A- 

+ (-174-a(p,-9^;-169^P^-48<P3')4(lZfe9-9c()^ 

-9Z4)e+4-39<p4+15^p4'-4ZO(p^^-3604)35+272{p^^^ '- 

j 

(i28.64(Pj.^, (-1660+72 d),-462 
.2oc^|(-384-224>,-2O(p^*‘-4O0(P^-132(p/)+i,(5664-35<p^ 
-202^5+26OO(^^60O0(P^-12OOc|),^- 14004)^^ +1464- 
+ 240(P^^.664)()]+|i (l320+7(Pg..2444)^+494(l)^^ 4fP+Z376(P^ 
+1500 4) + 336(|5+16(|^+96<gf )+ ^ (- 3 Z208 - 140 4)^ - 6544 (p^ 
-244O4)_^’-37i9Z(P/-7500O(P^-461684)3^-7O56cp^5 + 704 cp^g 

^a964>J^-53^(p^+1855(P^^95^5(|^+32256cPJJ^+24192^PT 

-384(p^Hl64)^+45312(P^JV668164)/j+lZ2S8ip/-36(P^g-1840c|^5 





(25) 



-80(j)/-2880c?^^+ il920(|'^l9604)^'-a4(p/J- 504 
=34^ 1^4 f™+64(j)^-648cP^'- 18244),. 256cj)/) 
+34^^(7498-20cPg-16(|.46O8(p/4 105924)^+ 288(^3 
-IIZO (j)33+360<j),^.2176(i|^^j.3=4^ (^-28096 + -88 (Pg 
-55444)^- ZI51 4)^^-38776(^,^-67612 (g,-3T72ac|^-3526 cP^^ 
+5284)^3+672 4 ) 3 ^ - 2008c^+ 1392 ^5^9216(1+ 7152 
+24192p^35+16144P^^^-10800p^+33984 501 1 2 

-4416 P 33 + 71524)/^'-48P^-2368P^^5-96<j);-11764)/^' 
-60p,"^'-480P^')+(il4912-5(^-18c^-330c?^-6P^^-l040P^^ 

-1 532p^-3640<P^^-168<|)/-7380(|^-36720(p^3^-68880 

-39240 (p“-122404^'-725 76(P^5+ 23120(^,^-57240 
-367200 255240 c|“-596160 ^^'^^35568^^^-246240 (gj 

-55295OPj3^-475Z0p^‘’-79Zp^g-l00acP^^+47O64({^-2088(f5' 
-10728(p^f-36288<p^^^-272164)2^+330480(^'-5O976P^' 

- 75168(p^3 '158Z4(p/- 264 4)^- 336^^^.3048 696 

-35764)^^-i2096(P^^3-9O72P^^174Z4p“-16992P” 
-25056p/^'-4608 ^^^'+72 4)^g+4324)3g+10084)g+38l6O(P, 

.3456cp35+Z0736p/W384(P35.305496(p,^+862304p^ 

+ Z77344p3^24-(e,g+li52<^'+42p^^l5i2p3^V504(|^'j8lb4)^)]. 



C. The semi-invariants of $\sigma$, viz. $d_1$, $d_2$, $d_3$, etc.

Now and ^-p f’-™’’ the i elation (21 ) 

Therefore, the "g" functions here are:

"Jr ('fi 

( 26 ) ^ ' 


Substituting from (26) into the expressions for the semi-invariants
of $\sigma$ from the relations (2) gives:



= 

Section 4. The case in which the compound frequency function
may possess non-zero semi-invariants of all orders.

Instead of the components of the compound frequency function being
considered normal, I now assume that they may possess non-zero
semi-invariants as far as the third order, and I again derive
expressions for the parameters of the compound in terms of the
parameters of the components. The method of derivation is entirely
analogous to that in Section 1, where the components were normal. The
expressions for the $L_n$, the semi-invariants of the compound, are
seen to be more complicated in the present case, but this complexity
is more apparent than real. In fact, I have succeeded in deducing a
rather simple general law, by means of which these expressions may be
written out, and this law is still applicable, even if the two
components should possess non-zero semi-invariants of higher orders
than the third.

I now write 


(27) e 


= pe 


, zt’ 


1^3^' 



V* 51 


where 




1 1 






-A- 


A. 


r = 


a 

p ' 


and $m_1$, $m_2$, $\sigma_1$, $\sigma_2$ have the same significance as
in Section 1, and $\lambda_{13}$, $\lambda_{23}$ are the third
semi-invariants of $\phi_1(x)$, $\phi_2(x)$ respectively, and as
before p+q=1. Taking logarithms in (27), we have




L|t+L^-^ + L33!+-" 


(28) 




further 


(29) 


1+re 


K=1 ^ ^ ^ 





in which

$$\theta = at + \frac{bt^2}{2!} + \frac{ct^3}{3!}.$$


The right member of (29), may be put into the form 
<><5 K iJ©'' ] 

hr 

k=:i U1 k J ^ 


Therefore, equating corresponding coefficients of $t^n$ in the right
and left members of (28), for n > 3, gives


(30) 


n 

2: L 

Ml Ui 




l^.nl 






where the last summation is taken over all values of j , such that 
the following diophantine equations are satisfied. 


(31)    $\alpha + \beta + \gamma = j$,    $\alpha + 2\beta + 3\gamma = n$.

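Using (30), I obtain, for n = 1 to 8 inclusive, the first eight
semi-invariants of the compound as follows,

$$
\begin{aligned}
L_1 &= m_1 + a q_1, \\
L_2 &= \sigma_1^2 + a^2 q_2 + b q_1, \\
L_3 &= \lambda_{13} + a^3 q_3 + 3ab\,q_2 + c\,q_1, \\
L_4 &= a^4 q_4 + 6a^2 b\,q_3 + 3b^2 q_2 + 4ac\,q_2, \\
L_5 &= a^5 q_5 + 10a^3 b\,q_4 + 15ab^2 q_3 + 10a^2 c\,q_3 + 10bc\,q_2, \\
L_6 &= a^6 q_6 + 15a^4 b\,q_5 + 45a^2 b^2 q_4 + 15b^3 q_3 + 20a^3 c\,q_4
       + 60abc\,q_3 + 10c^2 q_2, \\
L_7 &= a^7 q_7 + 21a^5 b\,q_6 + 105a^3 b^2 q_5 + 105ab^3 q_4
       + 35a^4 c\,q_5 + 210a^2 bc\,q_4 + 70ac^2 q_3 + 105b^2 c\,q_3, \\
L_8 &= a^8 q_8 + 28a^6 b\,q_7 + 210a^4 b^2 q_6 + 420a^2 b^3 q_5
       + 105b^4 q_4 + 56a^5 c\,q_6 + 560a^3 bc\,q_5 \\
    &\quad + 280a^2 c^2 q_4 + 840ab^2 c\,q_4 + 280bc^2 q_3,
\end{aligned}
\tag{32}
$$

in which $q_2 = pq$, $q_3 = pq(p-q)$, etc., and are in fact the same
polynomials that occurred in the discussion of the case for normal
components in Section 1.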
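
The way each $L_n$ is assembled from the solutions of the diophantine equations (31) can be sketched in Python. This is only an illustration under stated assumptions: the function name `partition_terms` is hypothetical, and the numerical coefficient $n!/(\alpha!\,\beta!\,\gamma!\,(1!)^{\alpha}(2!)^{\beta}(3!)^{\gamma})$ is the usual count of ways of dividing n distinct objects into $\alpha$ singles, $\beta$ pairs and $\gamma$ triples, assumed here to be the coefficient meant in (30).

```python
from math import factorial


def partition_terms(n):
    """List (alpha, beta, gamma, j, coeff) for every solution of (31).

    Each entry stands for the term  coeff * a**alpha * b**beta * c**gamma * q_j
    with alpha + 2*beta + 3*gamma = n and j = alpha + beta + gamma.
    """
    terms = []
    for gamma in range(n // 3 + 1):
        for beta in range((n - 3 * gamma) // 2 + 1):
            alpha = n - 2 * beta - 3 * gamma
            j = alpha + beta + gamma
            # Number of ways to split n labelled objects into alpha singles,
            # beta pairs and gamma triples (assumed coefficient of the term).
            coeff = factorial(n) // (
                factorial(alpha) * factorial(beta) * factorial(gamma)
                * 2 ** beta * 6 ** gamma
            )
            terms.append((alpha, beta, gamma, j, coeff))
    return terms


for alpha, beta, gamma, j, coeff in partition_terms(6):
    print(f"{coeff:3d} a^{alpha} b^{beta} c^{gamma} q_{j}")
```

Calling `partition_terms(n)` for other values of n lists the terms of the corresponding semi-invariant in the same way; for n not exceeding 3 the semi-invariant of the first component would still have to be added separately.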

The expression for the $L_n$ in (5) may be put into a form similar
to that of (30), and then, if we compare these two forms, it is
obvious that no new polynomials will occur, in addition to those
which appeared in (5), and this would be true however many non-zero
semi-invariants the two components may have.

Using the results established for the $L_n$ in Section 1, it is
evident that, if we consider a particular semi-invariant, say $L_n$
of (5), the terms in the right member can be readily written down, if
we determine all the j-part partitions of n, where j and n are fixed,
using the integers 1 and 2 as part magnitudes. Suppose we have
$\alpha$ parts, each equal to 1, and $\beta$ parts, each equal to 2,
where $\alpha + \beta = j$ and $\alpha + 2\beta = n$; then such a
partition corresponds to a term of the type (omitting the numerical
coefficient) $a^{\alpha} b^{\beta} q_j$, the factor $q_j$ arising
from the last summation of (5), which is clearly seen if the latter
be put into the same form as (30). In addition, it is to be noted
that any j-part partition of n will be unique for this case.

Now if we consider the case of the present Section, for n > 3
the $L_n$ of (30) will be seen to contain, as well as the same terms
of the corresponding $L_n$ of (5), some additional terms, the latter
appearing on account of the fact that, since the integer 3 is now
admitted along with 1 and 2 as a part magnitude, the j-part
partitions of n will no longer be unique for every possible value of
j. These j-part partitions of n will, moreover, give rise to terms of
the type $a^{\alpha} b^{\beta} c^{\gamma} q_j$ where the relations
(31) are satisfied. Further, the total number of terms in a given
$L_n$, for n not too large, can be readily obtained by making use of
the so-called "enumerating function," discussed in works on
combinatorial analysis, which enables one to determine the number of
partitions of a given integer n, when the number of parts j, and the
part magnitudes 1, 2, 3, etc. are fixed.
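
The count supplied by the enumerating function can be sketched numerically. The following Python fragment is an illustration only: it assumes the enumerating function is the product of the factors $1/(1 - z x^m)$ over the admitted part magnitudes m, so that the coefficient of $z^j x^n$ is the number of j-part partitions of n, and the name `count_partitions` is hypothetical.

```python
def count_partitions(n_max, parts=(1, 2, 3)):
    """table[n][j] = number of partitions of n into exactly j parts,
    every part being one of the magnitudes in `parts`."""
    table = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    table[0][0] = 1
    for m in parts:
        # Multiplying the enumerating function by 1 / (1 - z * x**m):
        # each further part of magnitude m raises n by m and j by 1.
        for n in range(m, n_max + 1):
            for j in range(1, n_max + 1):
                table[n][j] += table[n - m][j - 1]
    return table


table = count_partitions(8)
for n in (6, 7, 8):
    print(n, sum(table[n]), table[n])
```

With `parts=(1, 2)` no entry of the table exceeds unity, which is the uniqueness noted for normal components; with the magnitude 3 admitted, some entries exceed unity, and the row sum gives the total number of terms in the corresponding semi-invariant.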

It would appear, from the above discussion, that the partition
method of obtaining the terms of $L_n$ could be carried over to the
most general case, in which the components may possess non-zero
semi-invariants of all orders.

I shall now indicate, without going into detail, that, if in the
expressions (30) for n > 3, I set every $q_j$ equal to unity, then
the $L_n$ become functions of a, b, and c only, which I call
$\rho_n$, and the latter obey a recursion law analogous to the one
established for the $\rho_n$ of (10), namely


(33) 


where now

$$\rho_n = \left[\frac{d^n}{dt^n}\, e^{\,at + \frac{bt^2}{2!} + \frac{ct^3}{3!}}\right]_{t=0}$$

by definition.



Putting + + 




where $P_n(x, y, z)$ is an n-th degree polynomial in x, y, and z.
Again, it is readily seen that




dt n 


(c, b+ci,a+bt+~) 


(35) 


t=0 




P^(c, b+ct, cL+bi+ — e 


t=0 


and that 




t=0 


a-P^-e 


cP(t) 


t-0' 


Now differentiating the left member of (34) with respect to t, and
then setting t = 0, gives the next $\rho$, viz. $\rho_{n+1}$, whilst
the derivative of the right member of (34), then setting t = 0, would
equal the sum of the right members of (35) and (36), and thus the law
in this case is established.
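
A recursion of this kind can be checked mechanically. The sketch below is an assumption-laden illustration, not a transcription of (33): it takes $\rho_n$ to be the n-th derivative at t = 0 of $e^{at + bt^2/2! + ct^3/3!}$, builds it by the partition rule with every $q_j$ set to unity, and tests the operator form $\rho_{n+1} = (a + b\,\partial/\partial a + c\,\partial/\partial b)\,\rho_n$, which follows from differentiating that exponential with respect to t. The names `rho` and `step` are hypothetical.

```python
from math import factorial


def rho(n):
    """rho_n as a dictionary {(alpha, beta, gamma): coefficient} of the
    monomials a**alpha * b**beta * c**gamma (partition rule, all q_j = 1)."""
    poly = {}
    for gamma in range(n // 3 + 1):
        for beta in range((n - 3 * gamma) // 2 + 1):
            alpha = n - 2 * beta - 3 * gamma
            poly[(alpha, beta, gamma)] = factorial(n) // (
                factorial(alpha) * factorial(beta) * factorial(gamma)
                * 2 ** beta * 6 ** gamma
            )
    return poly


def step(poly):
    """Apply the operator  a + b * d/da + c * d/db  to a polynomial."""
    out = {}

    def add(key, value):
        if value:
            out[key] = out.get(key, 0) + value

    for (alpha, beta, gamma), coeff in poly.items():
        add((alpha + 1, beta, gamma), coeff)                 # a * term
        if alpha:
            add((alpha - 1, beta + 1, gamma), coeff * alpha)  # b * d/da
        if beta:
            add((alpha, beta - 1, gamma + 1), coeff * beta)   # c * d/db
    return out


print(all(step(rho(n)) == rho(n + 1) for n in range(10)))
```

The same check extends to more letters d, e, etc. by adding one further shift per letter, which is one way of reading the remark that the work reduces to a partition process and a taking of derivatives.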

It is at once apparent that the recursion formulae of (13) 
and (33) may be generalized, so that, if the two components should 
possess non-zero semi-invariants of all orders, the law for the
$\rho_n$ would then be



a, b, c, d, etc. being the differences between the 1st, 2nd, etc.
semi-invariants of the two components, respectively. Thus it appears
that the actual writing down of the expressions for the parameters
of a compound frequency function in terms of the parameters
of its two components may be reduced to a partition process, and a
taking of derivatives.
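
The partition rule of this Section can be tested against a direct calculation. The Python sketch below is illustrative only and rests on stated assumptions: the $q_j$ are taken to be the semi-invariants of a two-point variate equal to 1 with probability q and 0 with probability p (which gives $q_1 = q$, $q_2 = pq$, $q_3 = pq(p-q)$), the component "distributions" are treated formally through their semi-invariants up to the third order, and all function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def moments_from_cumulants(kappa):
    """Raw moments mu[0..n] from semi-invariants kappa[1..n] (kappa[0] unused)."""
    mu = [Fraction(1)]
    for n in range(1, len(kappa)):
        mu.append(sum(comb(n - 1, k - 1) * kappa[k] * mu[n - k]
                      for k in range(1, n + 1)))
    return mu


def cumulants_from_moments(mu):
    """Semi-invariants kappa[1..n] from raw moments mu[0..n]."""
    kappa = [Fraction(0)]
    for n in range(1, len(mu)):
        kappa.append(mu[n] - sum(comb(n - 1, k - 1) * kappa[k] * mu[n - k]
                                 for k in range(1, n)))
    return kappa


def compound_direct(p, k1, k2):
    """Semi-invariants of p*phi_1 + q*phi_2 computed from its raw moments."""
    q = 1 - p
    mix = [p * u + q * v for u, v in zip(moments_from_cumulants(k1),
                                         moments_from_cumulants(k2))]
    return cumulants_from_moments(mix)


def compound_by_partitions(p, k1, k2):
    """Same semi-invariants from the partition rule (assumed form of q_j)."""
    q = 1 - p
    n_max = len(k1) - 1
    a, b, c = (k2[i] - k1[i] for i in (1, 2, 3))
    # Every raw moment of the two-point variate equals q.
    qj = cumulants_from_moments([Fraction(1)] + [q] * n_max)
    out = [Fraction(0)]
    for n in range(1, n_max + 1):
        total = k1[n]  # semi-invariant of the first component (zero for n > 3)
        for gamma in range(n // 3 + 1):
            for beta in range((n - 3 * gamma) // 2 + 1):
                alpha = n - 2 * beta - 3 * gamma
                coeff = factorial(n) // (factorial(alpha) * factorial(beta)
                                         * factorial(gamma) * 2 ** beta * 6 ** gamma)
                total += (coeff * a ** alpha * b ** beta * c ** gamma
                          * qj[alpha + beta + gamma])
        out.append(total)
    return out


def pad(kappa, n_max):
    """Extend a list of semi-invariants with zeros up to order n_max."""
    values = [Fraction(x) for x in kappa]
    return values + [Fraction(0)] * (n_max + 1 - len(values))


p = Fraction(2, 5)
k1 = pad([0, 1, 2, -1], 8)    # m_1, sigma_1^2, third semi-invariant
k2 = pad([0, -3, 5, 4], 8)    # m_2, sigma_2^2, third semi-invariant
print(compound_direct(p, k1, k2) == compound_by_partitions(p, k1, k2))
```

Exact rational arithmetic is used so that the two routes can be compared term by term rather than to within rounding error.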

Section 5. The limiting compound frequency function, when
the number of components is allowed to become indefinitely large.

It is to be noted that if the compound is assumed to be com- 
posed of a greater number of components than two, then the 
mathematical development becomes heavy, but a rather interesting
case arises, when we consider the form of the limiting compound 
frequency function, when its components, infinite in number, and 
identical in form, each contribute the same proportion to the total 
frequency of the compound, and have their means distributed ac- 
cording to the known frequency law f (x). 

First of all I consider the compound to be composed of a finite 
number of components, say M+1, of the type indicated, and later
pass to the limit, allowing M to become indefinitely large. 

I write now 


( 3 ;) 




i • 

in which $p_i = \frac{1}{M+1}$ for all i = 1, 2, 3, etc., and
$\lambda_{i1}$, $\lambda_{i2}$, etc., are the 1st, 2nd, etc.
semi-invariants respectively of the i-th component. The right member
of (37) may be written







in which $m_i$ is the mean of some component and $m_{i+1}$ is the
mean of the (i+1)st component.

If we now assume that m^= , then 

“t ^ 


(38) 


= e 


t2 ^ 

+^13 5 !" 


1 

M^i 


M+1 

i+S e 

t=^ 


m^t 


and the right member of the last relation is the generating function
for the moments of the compound frequency function. Allowing
M to become indefinitely large, we have


$$\lim_{M\to\infty} \frac{1}{M+1}\left[e^{m_1 t} + e^{m_2 t} + \cdots + e^{m_{M+1} t}\right]
= \int_{-\infty}^{\infty} e^{xt} f(x)\,dx = G_f(t).$$



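Therefore the limit of the generating function (the right member
of (38)) is given by

$$e^{\lambda_{12}\frac{t^2}{2!} + \lambda_{13}\frac{t^3}{3!} + \cdots} \cdot G_f(t),$$

so that the semi-invariants of the limiting compound frequency
function are given by

$$e^{L_1 t + L_2\frac{t^2}{2!} + L_3\frac{t^3}{3!} + \cdots}
= e^{\lambda_{12}\frac{t^2}{2!} + \lambda_{13}\frac{t^3}{3!} + \cdots} \cdot G_f(t).$$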





From this last relation, it will be seen that the mean of the limiting
compound is equal to the mean of the means of the components.
Further, if the means of the components are normally distributed,
and the components themselves are normal, then the limiting compound
frequency function is also normal. More generally, if the components
are normal, and their means follow any frequency law, then the
limiting compound function also follows this same law. If now,
considering the most general case of all, where the components may
have non-zero semi-invariants of all orders, and the means of the
components are distributed according to the frequency law f(x), then
the semi-invariants of the limiting compound frequency function may
always be calculated, and will be given by

in which the three quantities are the k-th semi-invariants of the
limiting compound function, of one of the components, and of f(x)
respectively.
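
The additive behaviour of the semi-invariants can already be seen for a finite number of components. The Python sketch below is an illustration under assumptions of its own: M+1 components of identical form are given equal weight, the component semi-invariants are taken about the component's own mean (so the first is zero), and the semi-invariants of the compound are compared with the sum of those of one component and those of the discrete distribution of the means. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def moments_from_cumulants(kappa):
    """Raw moments mu[0..n] from semi-invariants kappa[1..n] (kappa[0] unused)."""
    mu = [Fraction(1)]
    for n in range(1, len(kappa)):
        mu.append(sum(comb(n - 1, k - 1) * kappa[k] * mu[n - k]
                      for k in range(1, n + 1)))
    return mu


def cumulants_from_moments(mu):
    """Semi-invariants kappa[1..n] from raw moments mu[0..n]."""
    kappa = [Fraction(0)]
    for n in range(1, len(mu)):
        kappa.append(mu[n] - sum(comb(n - 1, k - 1) * kappa[k] * mu[n - k]
                                 for k in range(1, n)))
    return kappa


def compound_cumulants(means, shape):
    """Semi-invariants of the equally weighted compound.

    shape[k] is the k-th semi-invariant of every component about its own
    mean (shape[1] must be 0); `means` lists the component means.
    """
    weight = Fraction(1, len(means))
    mix = [Fraction(0)] * len(shape)
    for m in means:
        kappa = list(shape)
        kappa[1] = Fraction(m)          # shift the component to mean m
        mu = moments_from_cumulants(kappa)
        mix = [acc + weight * u for acc, u in zip(mix, mu)]
    return cumulants_from_moments(mix)


def cumulants_of_means(means, n_max):
    """Semi-invariants of the discrete distribution of the means."""
    weight = Fraction(1, len(means))
    mu = [sum(weight * Fraction(m) ** n for m in means)
          for n in range(n_max + 1)]
    return cumulants_from_moments(mu)


means = [-2, 0, 1, 5]
shape = [Fraction(x) for x in (0, 0, 3, 1, 2)]
L = compound_cumulants(means, shape)
K = cumulants_of_means(means, 4)
print(all(L[k] == shape[k] + K[k] for k in range(1, 5)))
```

Because the relation holds exactly for every finite M, it persists in the limit, where the discrete distribution of the means is replaced by f(x).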

This shows that the variate z of the limiting compound frequency
function is distributed as if it were the sum of two independent
variates, one of which is distributed according to the law
of the means, and the other according to one of the components.
To write down the actual distribution function for the limiting
compound is quite another matter, but since we may write, in the
limit, when




iZZ! '^13 3r 




then, the distribution function sought, provided it fulfills the neces- 
sary conditions, may be given formally by means of the Fourier 
Integral Theorem, 







PART 2. 

As indicated previously, in the introduction to this paper, we
are concerned in this second part with an entirely new problem,
in which we are now sampling from two distinct parent populations
instead of from only one, as in Part 1. Hence, in order to obtain
the desired sampling results, we must have recourse to an entirely
different method of treatment from any we have made use of heretofore.
I shall suppose that the two parent populations may possess non-zero
semi-invariants of all orders, and that a random sample of r is taken
from the first population, and a random sample of s from the second
population, these two samples being then combined to give the
composite sample from the combined populations.

Section 6. The semi-invariants of "moments about a fixed
point" in samples from the compound frequency function.

With the above hypotheses, I shall derive in this section expressions
for the semi-invariants of "moments about a fixed point" in
samples from the compound population.

Calling the required semi-invariants $S_1$, $S_2$, $S_3$, etc., we
have, by definition

e 


(39) 




■J 


^ n il+S n 
1-1 J=^r+1 




in which $\phi_1(x)$, $\phi_2(x)$ are the initial parent frequency
functions, and $x_i$, $x_j$ indicate that the variate was taken from
the first and second parent respectively. By a suitable
transformation of the parameter in the power of the exponential which
appears in the right member of (39), this same member may be put into
the required form



On equating corresponding coefficients of the powers of the parameter
in the last expression and the left member of (39), I get


(r + 5)K 


in which the two sets of quantities are the 1st, 2nd, 3rd, etc.,
semi-invariants of the moment in samples from the two component
populations $\phi_1(x)$ and $\phi_2(x)$, respectively, the values of
which are well known.¹
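
The principle behind a result of this kind is that semi-invariants of independent contributions add, and that a semi-invariant of order k is multiplied by $c^k$ when the variate is multiplied by c. The Python sketch below checks this by exact enumeration for two small discrete populations; it illustrates the principle only and does not transcribe the relation obtained above. The populations, the sample sizes and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def cumulants(dist, k_max):
    """Semi-invariants 1..k_max of a discrete distribution [(value, prob), ...]."""
    mu = [sum(p * v ** n for v, p in dist) for n in range(k_max + 1)]
    kappa = [Fraction(0)]
    for n in range(1, k_max + 1):
        kappa.append(mu[n] - sum(comb(n - 1, k - 1) * kappa[k] * mu[n - k]
                                 for k in range(1, n)))
    return kappa


def single_draw(pop, n):
    """Distribution of x**n for one observation x drawn from `pop`."""
    return [(Fraction(v) ** n, p) for v, p in pop]


def sample_moment_dist(pop1, pop2, r, s, n):
    """Exact distribution of the n-th moment about the origin in a combined
    sample of r observations from pop1 and s observations from pop2."""
    N = r + s
    dist = []
    for draw in product(*([pop1] * r + [pop2] * s)):
        prob = Fraction(1)
        total = Fraction(0)
        for v, p in draw:
            prob *= p
            total += Fraction(v) ** n
        dist.append((total / N, prob))
    return dist


half, third = Fraction(1, 2), Fraction(1, 3)
pop1 = [(0, half), (1, half)]
pop2 = [(0, third), (2, 2 * third)]
r, s, n, k_max = 2, 1, 2, 4

combined = cumulants(sample_moment_dist(pop1, pop2, r, s, n), k_max)
c1 = cumulants(single_draw(pop1, n), k_max)
c2 = cumulants(single_draw(pop2, n), k_max)
print(all(combined[k] == (r * c1[k] + s * c2[k]) / Fraction((r + s) ** k)
          for k in range(1, k_max + 1)))
```

Exhaustive enumeration is feasible only for very small r and s, but it gives an exact check with no sampling error.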


Section 7. The semi-invariants of "moments about the mean"
in samples from the compound frequency function.

Employing the same sampling procedure as in the last section,
I wish now to consider the semi-invariants of "moments in samples
from the combined population, about the mean of the combined
sample" and to express them in terms of $\alpha_1$, $\alpha_2$,
$\alpha_3$, ... and $\beta_1$, $\beta_2$, $\beta_3$, ... (the
semi-invariants of the component distributions $\phi_1(x)$,
$\phi_2(x)$ respectively), and r and s.

1 Loc. cit., pp. 12-13.

In order to obtain the desired results, I have made use of a
modification and extension of a method originally employed by
C. C. Craig¹ for the case of sampling from one normal parent
population. I shall first develop the theory for my case, on the basis
that the two parent populations may possess non-zero semi-invariants
of all orders, imposing the condition of normality only when
actually computing the desired results. The mean of the combined
sample is



in which $\delta_i = x_i - \bar{x}$ and r+s=N, for particular values
of n, and for infinitely many sets of r+s variates, assuming that
each member of each set is independent of all the rest. The N
$\delta$'s in each set satisfy $\sum_{i=1}^{N} \delta_i = 0$.
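
The linear dependence among the deviations can be made concrete. The Python sketch below assumes that each deviation is the observation less the mean of the combined sample, so that it is a linear form in the observations with coefficient (N-1)/N on its own observation and -1/N on every other; these coefficients are inferred from that definition and are only presumed to agree with the a's of the text. The function names are hypothetical.

```python
from fractions import Fraction


def deviation_matrix(N):
    """Coefficients a[i][j] with delta_i = sum_j a[i][j] * x_j."""
    return [[Fraction(N - 1, N) if i == j else Fraction(-1, N)
             for j in range(N)]
            for i in range(N)]


def deviations(xs):
    """Deviations of the observations from the mean of the combined sample."""
    N = len(xs)
    a = deviation_matrix(N)
    return [sum(a[i][j] * xs[j] for j in range(N)) for i in range(N)]


# r = 2 observations from the first parent, s = 3 from the second.
xs = [Fraction(v) for v in (3, -1, 4, 10, 2)]
d = deviations(xs)
print(d, sum(d))
```

Since every column of the coefficient matrix sums to zero, the deviations sum to zero whatever the observations, which is why only N-1 of them need enter the correlation function.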

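Now, let $F(\delta_1, \delta_2, \ldots, \delta_{N-1})$ be the
correlation function of the first N-1 $\delta$'s. Then
$F(\delta_1, \delta_2, \ldots, \delta_{N-1})\,d\delta_1\,d\delta_2 \cdots d\delta_{N-1}$
gives the probability that the first N-1 $\delta$'s fall
simultaneously within a cell

$(\delta_1 \pm \tfrac{1}{2}d\delta_1,\ \delta_2 \pm \tfrac{1}{2}d\delta_2,\ \ldots,\ \delta_{N-1} \pm \tfrac{1}{2}d\delta_{N-1})$.

The semi-invariants of $F(\delta_1, \delta_2, \ldots, \delta_{N-1})$
are defined by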



1 Loc. cit., pp. 1 to 35.



where e.g. 


/ 2 1 >, 


Setting 


we have 


N 


6 .- «Z 1 a,; x;: 
i-l ‘■J j 


a 

a 


i • . . , 

. - ±1 

11“ N 



G» 






in which $\alpha_1$, $\alpha_2$, etc., are the 1st, 2nd, etc.
semi-invariants of the first component $\phi_1(x)$, and $\beta_1$,
$\beta_2$, etc., are the corresponding semi-invariants for the second
component $\phi_2(x)$. It follows then that






N-1 


N-1 


N-l 




or 

A,. 


\,L=1 

k, 




N-l I 
N-l^rj 


.,^1 




^*^i,r+S^Z, r+s' ‘As-ljC+s 


where $k_1 + k_2 + \cdots + k_{N-1} = k$,

so that

A -J, [£aV* 


p+s 











By substituting for the a’s from the relation (41) in the last equa- 
tion, the latter may be reduced to the following convenient form, 

where, now, , provided that, 

at least one of the in 


£ (-if 

Ui 



(42) 





In the case of one parent population, all $\lambda_{k_1 k_2 \ldots}$
of the same type, i.e. whose subscripts are merely different
permutations of the same set of integers $k_1, k_2, \ldots$, were
equal. This is no longer true for two parent populations, for we
must now distinguish between the $\delta$'s which arise from
observations from the population $\phi_1(x)$ and those from
$\phi_2(x)$. I therefore introduce, at this point, what I shall call
the "bar notation". For example, from the relation (42), all
$\lambda_{k_1 \ldots k_r | k_{r+1} \ldots k_{r+s}}$ are of the same
type, and therefore equal, if the first r subscripts are merely
different permutations of the same set of integers
$k_1, \ldots, k_r$, whilst, quite apart from the first r subscripts,
the last s subscripts are also different permutations of the same set
of integers $k_{r+1}, \ldots, k_{r+s}$. In writing down the
semi-invariants of the correlation function F, using the "bar
notation", for convenience, I shall suppress zero subscripts.
Further, on account of the relation (42), all
$\lambda_{k_1 \ldots k_r | k_{r+1} \ldots k_{r+s}}$ for which
$k_1 + \cdots + k_{r+s} > 2$ will vanish, since we are assuming now
that our two parent populations are normal. As a matter of fact, the
only $\lambda$'s that I shall require here are the following,









\|0 “ N { ^ ■ 

\\i I ^ ■ 

^zio 

^0|2 “ N5.|r<<,2 +s(3;i + N(N-£)(3;j . 
\i(0 

Vl ^sC>z-Z^^z . 


The above expressions were obtained from (42), after assuming
(without loss of generality) that $r\alpha_1 + s\beta_1 = 0$. If,
instead of this last assumption, I had assumed that $\alpha_1$ or
$\beta_1$ were equal to zero, not only would the symmetry of the
final results have been destroyed, but the amount of labour necessary
to obtain them would also have been doubled. The symmetric
substitution actually made required that only half the final number
of terms be obtained, the remaining half in any particular result
being readily written down by interchanging the $\alpha$'s and
$\beta$'s, as well as r and s.
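
That a condition on the first-order semi-invariants loses no generality rests on the fact that moments about the sample mean are unaltered when every observation is shifted by the same constant. The Python sketch below checks this by exact enumeration for the second moment, and compares its expected value with $\{(N-1)(r\alpha_2 + s\beta_2) + rs(\alpha_1 - \beta_1)^2\}/N^2$. That expression is derived for this illustration by elementary means and is not transcribed from the results of the paper; the populations and function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def mean_var(pop):
    """First and second semi-invariants of a discrete population."""
    m = sum(p * v for v, p in pop)
    return m, sum(p * (v - m) ** 2 for v, p in pop)


def expected_nu2(pop1, pop2, r, s):
    """Exact expectation of the second moment about the mean of a combined
    sample of r observations from pop1 and s observations from pop2."""
    N = r + s
    total = Fraction(0)
    for draw in product(*([pop1] * r + [pop2] * s)):
        prob = Fraction(1)
        for _, p in draw:
            prob *= p
        xs = [v for v, _ in draw]
        xbar = Fraction(sum(xs), N)
        total += prob * sum((x - xbar) ** 2 for x in xs) / N
    return total


def formula_nu2(pop1, pop2, r, s):
    """Closed form assumed for this sketch (see the lead-in)."""
    N = r + s
    a1, a2 = mean_var(pop1)
    b1, b2 = mean_var(pop2)
    return ((N - 1) * (r * a2 + s * b2) + r * s * (a1 - b1) ** 2) / Fraction(N * N)


def shifted(pop, c):
    """The same population with every value increased by c."""
    return [(v + c, p) for v, p in pop]


half, third = Fraction(1, 2), Fraction(1, 3)
pop1 = [(0, half), (1, half)]
pop2 = [(0, third), (3, 2 * third)]
r, s = 2, 2

e = expected_nu2(pop1, pop2, r, s)
print(e == formula_nu2(pop1, pop2, r, s))
print(e == expected_nu2(shifted(pop1, 7), shifted(pop2, 7), r, s))
```

The closed form is symmetric under the interchange of the two parents together with r and s, which is the symmetry exploited in the text to halve the labour.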

Now, let $P(\nu_n)$ be the probability function for
$\nu_n = \frac{1}{N}\sum_{i=1}^{N} \delta_i^{\,n}$.

The semi-invariants of $P(\nu_n)$ are then defined by



Regarding the use of • 6^^) instead of 

in the above relation, see paper by C. C. Craig.¹

We wish now to express the semi-invariants $S_k(\nu_n)$ in terms of
the semi-invariants $\lambda$ of the correlation function
$F(\delta_1, \delta_2, \ldots)$ for the $\delta$'s. The
semi-invariants $L$ of the correlation function for the
$\delta_i^{\,n}$ are defined by



by expansion of the exponential function. Then, comparing the
relations in (43) and (44), it is readily seen that

1 Loc. cit., pp. 18 to 19.




(45) 5,^ = 


kl 


. L. 


in which the summation is taken over all values of
$k_1, k_2, \ldots$ such that $k_1 + k_2 + \cdots = k$.

Making use of the explicit relations for semi-invariants in terms
of moments and vice versa, we have from (44) and (40)



(46) -EE. .E 


(aOkbOkc!)*---- 

r! s' il 


(47) 


/ 

( ^ 

\ l=i ^ 


(n) 


-zz: .x 


(a!)''(bl)Hcl)t . nsitl 


where, in both these relations 


a > b > c ^ 


and 


ar bs + ct + ‘ ^r^ 


From the relations (46) and (47), the L's can be found in terms
of the moments of F, and these, in turn, can be found in terms
of the semi-invariants of F, by equating the coefficients of like
powers of the t's on both sides of the two equations. Examples
of the kind of relations obtained from (46) and (47) in particular
cases would be as follows,



Therefore 


L =\/ -v' .\/ 

210 ... O 4-20. O AO...O 020 . o 


'^'^20.. 0 ^ 20 . o'ozo . 0 . 








(t) 




+ iOj( EA'.it 


N 


Therefore 


i-ZO 0‘^^40 0*^0Z0 *0 

0 .0*^ ^^ZiO . ..O ^ 120 , . o 

. 0' ^ozo.^.o"*" ^^^20.. 0’ ^ 110.,, O • 


In my work, I actually make use of the following relations obtained
from (47), with certain terms omitted, which vanish when each of 
the parent distributions is normal. 







(iii) 


N \(4) f/N 


2 /H \( 2 )r/M 




r/N xWi^r/N 


(vi)(Ev/A; 


L«1 ^ >' 




N . \(7) 


,(i)i^r/N 




N \( 2 )r/N 




*^(5AV J 


N \( 2 )rr/'f^ 


N \&) 


(vi«)(EN(V J K.i'^i/J X 

7N 7M V2)7*r/N Nl^ f/M r/M \\6 

l{7^‘r“[(5i'‘*‘) J IfeAww) m'4 


By substituting the expressions for the L's given in (46) into the
right member of (45), a direct expression for the $S_k$'s in terms
of the moments of F is obtained, viz.





* K! 

3 »— CL|. 1^ k i" 1 7"i" Ti 

12. r 




N 

Z L: 


(K) 


omitting the parameters , and in 



(46). From relation (43) to this point, the theory follows exactly 
that given in the paper already cited. 

I shall next quote the final expressions A for some of the
$S_k(\nu_n)$ obtained by C. C. Craig,¹ in terms of the moments of F,
for the case of one parent population, and then I shall write down
the modified expressions B, when two parent populations are involved.


A. ■ 

B. 1. |'''^nlo 



r\/ 


b 




..2,rs>/ 


n 






nlo' '^o\’nl' 


In the paper mentioned above, expressions were also derived, using 
a method similar to the one already indicated, for the semi-invari- 
ants of the correlation function of two moments about the mean. 
I have made use of a modification of only one of these expressions 
as follows, 


1 Loc. cit., p. 22.









®' ^ll^'^i/n )= r^gi r'^m+nlo+ ^^olm^n + nlo 


-s(3-l)v^l^^^ + r3s^|i^+r5v' 


n m 


'’''m|o'^nlo+ ^oln+'"^^mlo'4ln+^^^olm ^nlo 


Here again, for the moments of the correlation function, I
employ the "bar notation", its meaning being exactly the same as
in the case of the $\lambda_{k_1 \ldots k_r | k_{r+1} \ldots k_{r+s}}$,
the discussion regarding identical types of moments and their
equality corresponding also in every detail. Once more, zero
subscripts are suppressed.

It now becomes necessary to express the modified moments in
the expressions B, for particular values of m and n, in terms of
the $\lambda_{k_1 \ldots k_r | k_{r+1} \ldots k_{r+s}}$ of the
correlation function F. To this end, I make use of the relations
(48), in conjunction with the so-called "$D_s$ operator of
Hammond"¹ which splits off a total integral part s, made up by
addition from any or all of a permutation of integers.

1 MacMahon, Combinatorial Analysis, Vol. I, p. 27.

At this point also, it is necessary to modify somewhat the use
of the $D_s$ operator, because of the "bar notation" used to
designate the moments and the semi-invariants of the correlation
function F, when two parent populations are being considered. In
making up the total integral part s, split off from the permutation
of integers $k_1 \ldots k_r | k_{r+1} \ldots k_{r+s}$, those parts
which are split off from the set of integers $k_1, k_2, \ldots, k_r$
must be kept distinct from those parts which are split off from the
set of integers $k_{r+1}, \ldots, k_{r+s}$, and this same rule
applies also to the residual permutations from each of these two
sets, after all the parts, with sum s, have been finally split off.
Hence, the use of the "bar notation" to effect this distinction. To
illustrate exactly what is meant here, suppose that I

wish to express in terms of A '<r-hs‘ 

case, I shall use the relation (v) of the set of equations (48), and I 
shall merely consider the contribution made by the second term in 
the right member of (v) to the final expression for , the other 
terms of (v) being treated in a similar manner. Now V^j^, (omit- 
ting a numerical factor) is the coefficient of in the left mem- 
ber of (v), I therefore seek the corresponding coefficient of 
in the second term of the right member of (v) this term being 

N \(Z) 

Using the modified form of the $D_s$ operator, we have

i>^cJ(3|z)-(zIo)e^(i|z)*(o1z)d*(31o)+ (lll)Df(2ll) 

.3(zloXlloXo|l)^(ol2)(l|o)+3(lll)(llo)to|l) 

Now, we are able to write down immediately the terms in
$\lambda_{k_1 \ldots k_r | k_{r+1} \ldots k_{r+s}}$ which arise. They are






Ordinarily, the numerical coefficients in (49) will need to be
multiplied by an integral factor, obtained as follows. A term may

/ ^ \ (5) 

be chosen from the expansion oi I 


general, in C^) ways. The numerical coefficient of the second term 
in the right member of (v) is also 10 (and in general, is, say ). 
The required factor for the above example is unity (or, in general, 


the quotient — • . It should be noticed in addition, that the sum 

of the coefficients in the final expression (49) should equal the 
numerical coefficient of the second term in the right member of (v) , 
with which we started, and this is seen to be the case. 

As a check, one may observe that, if in the results which I
obtain for $S_k(\nu_n)$, $S_{11}(\nu_m, \nu_n)$, the two normal parent populations

are identified, then the results for a single normal parent population 
are obtained. Note that, to get the results as usually given for a 
single normal parent population, one would further have to set 


I derived the following results, which have been checked by 
calculating the corresponding results for a single normal parent 
population, without assuming that the first order semi-invariants 

of the type .... 00 


V 


3jv^)=^|3Nr|N(N-2Ar(2H'3) .«/^j^3NsjN(N-z)ls(2N-3)| 


A 





+ sp^)+6N^[r(N-ZN + r)o;^\4-3(N ZN + 

*A 

h [bNra (2N-3)]o6^ N^r s (5^] , 


s/v^)=^4jr(N-2Ntr)«.^+3(N^-2H-v3)p^^Zr5[(r^^(3^+54^j3i’') 
+=«.^P2+(soc[.^2.+ rpj|5j)- r«.^ Pi<3z)] • 

5^(Vj) = ^b |br[NW+NV's(r +5Ns)j«<.^^=<.^+&5[N'^N% 

-r(s‘^'i-5N r)jp| f>^+3N^rs r (3^ /3^)+(r*^Pj+ s-C^. &i ) 

-Z(s‘<'J'^z '■■^i (9i )]-t-3r3[4fs\4^ Pj^+ ) 

•i-6rs(»ii*;j P^Pj,)-2|5(N^-ZN-r)ot* 3^ + r(N^-ZN-s)c<,j^ | 

+ Z |[n ( 6- M)(r- 3 )+ Z r [N(b-N)(s-r)+23^] /3^ | 

+ {[r(ZO-lZN + 3N^)-N(2M^-iON'^-16)]<2^^Pj,4s(ZO-12N+3N^) 
-N('zN^-10N+ib)]^^/3^^j +r r=^(3N^-i2N+20)-Nr(6N-30N+48) 
+N^(5N^-24N+32)]=t/ +s [5^(3N%N+Z0)-N5(5N-30N+4S) 
+n\'5N-24N^321 


.j,(vJ=|g jstNl’fTN^-ZON^-aMr+aSNV + AN V- ZB Nr +7r%;J 


•I-3N 3 (TN'^-ZON -6N s +38 Ns + 4 NV- Z<3 Ns^ Is 


+ P,)+ iZK% A"Az)]+i8NV3fs>^»i^^4 p;+ ) 

+ iZNVs (N-NV'Zs^)«<^oi^/3*i6^ + (N^-N\-Zr*)<<^*«4.p^ 

+6mVs (3N^ZNV-14Nr+7r^)<2^o^2(8z+(5N^+ZN^5 







- 14Ns+ 7s^) /Sj^/SgJ + lZrs (-2N -3N rs+ 12 N'^r 

-i5Nr^5r^)=<j^oi|/3^+ ('-2N -3N fs + 12 N^s -15Ns^ 
+53^)»t^y3^/3/ +12N^rs sf- 2N-r)oC^ + r(- 

+ 2 N - s ) =<.^^ (3^ J + fc) rs ’(3 N I & N V 1 5 N 3 - 1 2 N s S 3 5^ 
+8 rs Pi + (3N V- 6 N V 15 N V - 12 N r\3 

*1 r* 

48 r^ 6 )^|/ 3 ^ + 36 r 3 ("-Srs^- N'^s- N V+ 3 N^s - 2 N ^ 

+ 3NV)c<,^o<^/3^,32+(-5rS-NV- N^5^ + 3NV -ZN’ 

+ 3 N^s)o(,^o 4 ^ P^ J + 9 rs ( 9^^^-22 N^- 36 Nr 5 + 72 rs 
+ 6 N Vs)ct^ 18 r 5 J( 5 r 2 + 7 N^r^- 14 - N V + nV +<aN 

- 25N r\ 43 nV - 2iN^)pcJoc^ + (5 s^7NV- 1 4.1\| 

+ IN 5 +8 29 N 3 V 4 3 - 21 N ^ ) Pj |3^ ^ + 18 rs (N 

+ 4tsl^s+5NVs +5rS -19Nrs)o(,^^-<-^ (^^+(^N^+4N'^r 
+5N rs + 5r6^- l9Nr5)°<2 p^ 

+75 N V 3 N ^r 12 N V^- 1 26 N V +28 N"r +84 
- 32 !N^+ 4 N^-i 28 Nr^l 62 NV- 76 M®)^ 5 ’ + i 5 ( 36 s^ 
-l6NsS75 fi V+ 3 12 N a^-126 N*5 +28N s +84N'^ , 

-32N^ 4H^-i26 N. ^+ 162 N^s -78 N Pz + 6r ("5 r'^-39Nr’ 
4 iOZN^r^+5NV+9NV-56NV^-iZl NV+57Nr 
-6N"r+80N''-4ZN^+8N‘’)o<,^^‘^2+^^ ("55^-39 Ni^ 





+ lOEN V-i-3 N V+9N 3b N ^-121 Ns + 57N 

-6N^5 4-60 N'^-4.2N®-)-8N'^)|3^^)S^^6r5('72r^+66N^-5J 

+75 N - 36 Nr^-U8 Nr + 6 N 11 IZ N V)o6^ (5^ 

+6r3(723^+ b6N ^-51N % 75 N^s - 36 N 5^-126 N s + 6N 3* 
+ ilN^- 12 

|r[^^(N-3)+N(N^-H-5)]^^4^^+5 [r^(N-3) 

+N (N^- N -r)] 15^% rs [s(3-N)o;2 ] 

+ r ajjCN - 2)(r-s) + Zs] .<^^.2 Pj +|(N-z)( 5 -r) +Z r I 

- Pi P2+ s-^j, P/ )| . 

5 ^^'/^, -~fe js r(13 M V-7N ^Z N V+ 3N'5^4-N V + Z r^ 

-9Nr^)-+foC^+33('l3N’s-7N^ZN s\3N'^-4N’5 
+2 3 9 Ns 3r(6N^+ N‘^+3r^-5N®-Nr^+4-Nr 

-&N r)=<,^+ SsfbN^+N^-as^- 5N^-Ns^+4 N,^s ~8N 5 ) Pj^ 
i-brs(2r^-4Nr+N‘T + 2N^-N^^-<i4/Pi +6rs (Zs^ 

- 4 Ns + N % + Z N "■- P^ P^+ ZN rs ) 

- ^ 1 + ^\PI Pz)'(5 V/ PA) 







■t(r=4^(3^+ sa;^(3j^ )|-t-3r5 2f4-r3'N^ (3^ 0^ 4- 2(2r-5Mr 

+ 2 N ^ N V) p2+ 2 ( Zs^- 5Kb f EN ^4- Nh {5^p^ 

+(N^- £Ns+3rs-i-3^X^0A (N-Z,Nri-5r& 

+ (9 r . 6 N 4- 4 W 3 N r ) -h (9 s arf ^ 4^!Si 5 N 5)<«^ (3^"^ j 
v^) = ^ |k| r (6 r 2 b N r i- 57 N 4- 4 N ^ r^- a N V - ZO N ^ 

+ 7 N '’')"<;^V^^4- N s (63^- 2b N s\ 37rN S 4- 4 N ^5^- 8 N \ 
-Z0r'lA7N'^)p^^(3j4-r(75Nr^-i07NV + 56N^- 19 
+ 6lNn^-50NV^ 54 N V 4 - 3 K V^- bNV-4Z N 


h 5(75M3^- i07N^5 +56N^-i95A6Ns" 
-30N^5^+ 54NS 4-3IN^5^-6K‘’3-4ZN'^4-aN'^^^j3^^ 
h IS rb|(- N V 4 Z N 2 3=^4 4rs) (- FN ^5 4- ZN^- 2 r ^ 

+ 4 r 5 ) J f N 'r 5[( 5 r 0 ^^^ )- (soc^^ p" 

.r<l filo^ 


+Nr3[(-N^liN^5KV-lbr^-26r5)4j_% P^ +f-N- 
+llN^i-5N^i,-i65^'Z8r5)<^0^0^J + 3Nr3[(“N^3 


+3N^-5Np 4 Zr^)o'^^cz|p^^4-(-NV45\S^- 5Ns 
h Z 5 j P ^ pj ] + 5 rs r 5 [(- N ^ r + N 2 N 5 + 4 r 5 ) 
f(N^s^N^-^N^44^5) l^i l^z] ^ ^ ^ ^ 


-2rUf^^/\^r(r4-ZB) ] + 5 r 5 ( 19 r^+4l Kr 

(TNV-i il NSy2N^'’*• 




^3r3r-19524-4-lN5-23[NSbN3^-17N^54-il 

+ 2N^3-N‘'')c<,2 3i" 5 (’-i9r5 + 6lNrs +7Ns 

-N‘-N r-N^5-+5N^r)o^^^^p| + 3ps('-19r5+ 6Nr5 
+7Mr-N^-N^3-Nr+3NS)o<,j^^ (3^,02 + rs ('3IN^-9N5 
+ 19rp -6Nr +bN^5)ov^^24 ('3N^-9N^4l9r3-ai'Nls 




